hbuschme / TextGridTools

Read, write, and manipulate Praat TextGrid files with Python
GNU General Public License v3.0

Overlapping annotation on a tier #9

Open hbuschme opened 7 years ago

hbuschme commented 7 years ago

Currently TextGridTools does not support overlapping annotations on a single Tier. In my opinion this behaviour is reasonable. Overlapping annotations do not make sense and cannot be represented in the TextGrid file format.

Recently, however, I came across an ELAN file (example file) with overlapping annotations. These cannot be created in ELAN, but ELAN is able to open them without a problem and preserves them when saving a file containing them. I was not even able to load the file using TextGridTools because we are very strict. My question now is whether we should add an option that relaxes this constraint when loading ELAN files (it could for example result in a warning and move the overlapping annotation boundary).
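To make the "warn and move the boundary" idea concrete, here is a minimal sketch. It does not use the tgt API: annotations are plain `(start_time, end_time, text)` tuples, and the name `clip_overlaps` is hypothetical. It assumes the simplest policy, namely that the earlier annotation is shortened so that it ends where the next one starts.

```python
import warnings
from typing import List, Tuple

# Stand-in for an annotation: (start_time, end_time, text).
# This is an assumption for illustration, not a tgt class.
Annotation = Tuple[float, float, str]


def clip_overlaps(annotations: List[Annotation]) -> List[Annotation]:
    """Return a sorted copy in which no annotation ends after the next one starts.

    Each overlap triggers a warning, and the end of the earlier annotation
    is moved back to the start of the following annotation.
    """
    ordered = sorted(annotations, key=lambda a: (a[0], a[1]))
    result: List[Annotation] = []
    for i, (start, end, text) in enumerate(ordered):
        if i + 1 < len(ordered) and end > ordered[i + 1][0]:
            next_start = ordered[i + 1][0]
            warnings.warn(
                f"Annotation {text!r} ({start}-{end}) overlaps the next "
                f"annotation; moving its end to {next_start}."
            )
            end = next_start
        result.append((start, end, text))
    return result
```

Note that this policy loses information: an annotation that fully contains a later one is cut short, and two annotations with the same start time leave the first one with zero length. A real option would probably need to handle those cases separately.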

mwlodarczak commented 7 years ago

On 07/22/2017 06:03 PM, Hendrik Buschmeier wrote:

> Currently TextGridTools does not support overlapping annotations on a single Tier. In my opinion this behaviour is reasonable. Overlapping annotations do not make sense and cannot be represented in the TextGrid file format.
>
> Recently, however, I came across an ELAN file (example file) with overlapping annotations. These cannot be created in ELAN, but ELAN is able to open them without a problem and preserves them when saving a file containing them. I was not even able to load the file using TextGridTools because we are very strict. My question now is whether we should add an option that relaxes this constraint when loading ELAN files (it could for example result in a warning and move the overlapping annotation boundary).

I wonder whether there is any situation when overlaps within a tier are desirable and do not merely indicate corrupt input. Or are you saying that an option to ignore overlaps could be useful to fix these problems from within tgt?

The latter would actually make sense, but adding this feature opens up a number of issues. Most significantly, there is the question of what to do with the overlaps. I am afraid that simply moving the boundary of, say, the first (or second) of the overlapping annotations may not be flexible enough.

Of course, we could decide to produce a warning and leave it to the user to decide how to resolve the issues, but allowing tiers to contain overlapping annotations leads to two more difficulties:

  1. They will most likely interfere with the search methods (get_nearest_annotation, etc.).

  2. Since we probably would not want tgt to allow writing such tiers to a file, we would need to add some checking mechanism before exporting.
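To illustrate the first difficulty, here is a sketch of how a lookup that relies on sorted, non-overlapping annotations can go wrong. This is not how tgt implements its search methods (that has not been checked here); `annotation_at` is a hypothetical binary-search lookup over plain `(start_time, end_time, text)` tuples.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Stand-in for an annotation: (start_time, end_time, text).
Annotation = Tuple[float, float, str]


def annotation_at(annotations: List[Annotation], time: float) -> Optional[Annotation]:
    """Find the annotation containing `time` by binary search on start times.

    Only correct if `annotations` is sorted by start time and free of overlaps:
    it inspects just the last annotation that starts at or before `time`.
    """
    starts = [a[0] for a in annotations]
    i = bisect_right(starts, time) - 1
    if i >= 0 and annotations[i][0] <= time <= annotations[i][1]:
        return annotations[i]
    return None


clean = [(0.0, 1.0, "a"), (1.0, 2.0, "b")]
overlapping = [(0.0, 5.0, "long"), (1.0, 2.0, "short")]

print(annotation_at(clean, 0.5))        # finds "a"
print(annotation_at(overlapping, 3.0))  # None, although "long" spans 0.0 to 5.0
```

With overlaps, the lookup only ever considers "short" for any time after 1.0, so "long" becomes invisible for most of its duration. Any search method built on a similar assumption would need a slower scan or a different data structure.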

An alternative would be to simply emit an error message listing all overlaps and exit without creating a TextGrid object. The user could then open the file in ELAN and inspect it.
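A rough sketch of that stricter alternative follows. The same check could also serve as the guard before exporting mentioned above. All names (`OverlapError`, `find_overlaps`, `require_no_overlaps`) are hypothetical and not part of tgt; annotations are again plain `(start_time, end_time, text)` tuples.

```python
from typing import List, Optional, Tuple

# Stand-in for an annotation: (start_time, end_time, text).
Annotation = Tuple[float, float, str]


class OverlapError(ValueError):
    """Hypothetical exception raised when a tier contains overlaps."""


def find_overlaps(annotations: List[Annotation]) -> List[Tuple[Annotation, Annotation]]:
    """Report every annotation that starts before an earlier one has ended.

    Each offender is paired with the earlier annotation that reaches furthest,
    so not every overlapping pair is listed, but every offender is.
    """
    ordered = sorted(annotations, key=lambda a: (a[0], a[1]))
    overlaps: List[Tuple[Annotation, Annotation]] = []
    latest: Optional[Annotation] = None  # annotation with the latest end so far
    for current in ordered:
        if latest is not None and current[0] < latest[1]:
            overlaps.append((latest, current))
        if latest is None or current[1] > latest[1]:
            latest = current
    return overlaps


def require_no_overlaps(annotations: List[Annotation], tier_name: str = "tier") -> None:
    """Raise OverlapError listing all detected overlaps; do nothing otherwise."""
    overlaps = find_overlaps(annotations)
    if overlaps:
        lines = [
            f"  {a[2]!r} [{a[0]}, {a[1]}] overlaps {b[2]!r} [{b[0]}, {b[1]}]"
            for a, b in overlaps
        ]
        raise OverlapError(
            f"{len(overlaps)} overlap(s) on {tier_name!r}:\n" + "\n".join(lines)
        )
```

Annotations that merely touch (one ends exactly where the next starts) are not treated as overlapping here, which matches how adjacent intervals normally appear on a tier.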

Marcin