Open dasch124 opened 9 months ago
Dear Daniel and Veronika, I tried to separate all the segments that contained different speakers and I added the speaker IDs. I pushed the ELAN file to GitHub. I hope it is fine now. The problem occurs when two people speak at the same time and the voices overlap. In those cases (I think it happens twice), I could not separate the segments, and for the moment I left them together, even though this is not a good solution either. Before moving on to solving this issue, I would first like to know whether what I have done so far looks good to you.
In several texts (e.g. Urfa-107_Cotton_Business) one ELAN segment contains utterances of several speakers. It would be good to separate those:
We can then transform this into `@who` attributes. If the original context should be restored, curators can afterwards add `<annotationBlock>` elements around the separated `<u>` elements after tokenization.
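As a rough illustration of the transformation described above, here is a minimal sketch in Python. It assumes the segments have already been split per speaker in ELAN and exported as `(origin, speaker, text)` triples, where `origin` is the ID of the original, unsplit segment. The function names, the triple layout, and the use of `@n` to record the original segment are all assumptions made for this example, not part of the actual pipeline.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def to_utterances(segments):
    """Turn (origin, speaker, text) triples into (origin, <u who="#...">) pairs.

    The triple layout is hypothetical; adapt it to the real ELAN export.
    """
    pairs = []
    for origin, speaker, text in segments:
        u = ET.Element("u", {"who": f"#{speaker}"})
        u.text = text
        pairs.append((origin, u))
    return pairs


def wrap_in_blocks(pairs):
    """Group consecutive <u> elements that share an origin into one <annotationBlock>.

    Storing the origin in @n is an assumption for illustration only.
    """
    blocks = []
    for origin, group in groupby(pairs, key=lambda pair: pair[0]):
        block = ET.Element("annotationBlock", {"n": origin})
        for _, u in group:
            block.append(u)
        blocks.append(block)
    return blocks
```

For example, calling `wrap_in_blocks(to_utterances([("a1", "SP1", "..."), ("a1", "SP2", "...")]))` would yield a single `<annotationBlock>` holding two `<u>` elements with different `@who` values. Overlapping speech is not handled here, since the issue leaves that case open.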