vocalpy / crowsetta

A tool to work with any format for annotating vocalizations
https://crowsetta.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

ENH: Add support for TextGrid files in binary format #242

Open NickleDave opened 1 year ago

NickleDave commented 1 year ago

We can parse TextGrid files saved in text but not binary format.
I have found some implementations that handle the binary format, but supporting it looks like it will require careful thought about edge cases in low-level details that I don't know well.

# Header
b'ooBinaryFile\x08TextGrid'         # where \x08 = len('TextGrid')
grid_xmin : double
grid_xmax : double
exists : bool
tiers : int

# Per Tier
str_len : byte
tier_type : str_len * byte
str_len : int
tier_name : str_len * byte
(tier_xmin : double)                 # discarded
(tier_xmax : double)                 # discarded
elements : int

# Per Point
xpos : double
str_len : short
if str_len != -1:
    text : str_len * byte
else:
    # discard the -1
    str_len : short
    text : str_len * byte

# Per Interval
xmin : double
xmax : double
str_len : short
if str_len != -1:
    text : str_len * byte
else:
    # discard the -1
    str_len : short
    text : str_len * byte
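
As a starting point, here is a minimal sketch of how the header and the per-interval fields above might be read with Python's `struct` module. It assumes big-endian byte order, which the layout above does not state, and the names `read_header`, `read_text`, `read_interval`, and `_read_exact` are hypothetical helpers, not part of crowsetta or of any existing TextGrid parser.

```python
import struct

# Magic bytes from the header layout above; \x08 is len('TextGrid').
MAGIC = b"ooBinaryFile\x08TextGrid"


def _read_exact(stream, n):
    """Read exactly n bytes, failing loudly on truncated or corrupt input."""
    if n < 0:
        raise ValueError(f"invalid length: {n}")
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"expected {n} bytes, got {len(data)}")
    return data


def read_header(stream):
    """Return (grid_xmin, grid_xmax, exists, tiers) from the file header."""
    if _read_exact(stream, len(MAGIC)) != MAGIC:
        raise ValueError("not a binary TextGrid file")
    # Assumption: all multi-byte values are big-endian ('>').
    xmin, xmax = struct.unpack(">2d", _read_exact(stream, 16))
    (exists,) = struct.unpack(">?", _read_exact(stream, 1))
    (n_tiers,) = struct.unpack(">i", _read_exact(stream, 4))
    return xmin, xmax, exists, n_tiers


def read_text(stream):
    """Read a length-prefixed label and return its raw bytes."""
    (str_len,) = struct.unpack(">h", _read_exact(stream, 2))
    if str_len == -1:
        # Discard the -1 marker; the real length follows as another short.
        (str_len,) = struct.unpack(">h", _read_exact(stream, 2))
    return _read_exact(stream, str_len)


def read_interval(stream):
    """Return (xmin, xmax, text) for one interval."""
    xmin, xmax = struct.unpack(">2d", _read_exact(stream, 16))
    return xmin, xmax, read_text(stream)
```

The sketch returns label text as raw bytes because the layout above does not say how the text is encoded, in particular after the -1 marker. Deciding how to decode it is one of the edge cases that would need checking against real files.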