Closed (macramole closed this issue 3 years ago)
What happens when you validate the EAF file from ELAN (File -> Validate EAF File)?
It might be that ELAN changed the EAF format in newer versions. The XML schema should be declared in the file header.
@sarpu these errors might be relevant:
ERROR: tier "code" has no parent tier but has stereotype CONSTRAINT "Symbolic_Association" defined in its linguistic type "dependency"
ERROR: the tier "code" contains 15 alignable annotations not consistent with tier stereotype "Symbolic_Association"
Checking tier: code_num
ERROR: tier "code_num" has no parent tier but has stereotype CONSTRAINT "Symbolic_Association" defined in its linguistic type "dependency"
ERROR: the tier "code_num" contains 15 alignable annotations not consistent with tier stereotype "Symbolic_Association"
Checking tier: on_off
ERROR: tier "on_off" has no parent tier but has stereotype CONSTRAINT "Symbolic_Association" defined in its linguistic type "dependency"
ERROR: the tier "on_off" contains 15 alignable annotations not consistent with tier stereotype "Symbolic_Association"
Checking tier: context
ERROR: tier "context" has no parent tier but has stereotype CONSTRAINT "Symbolic_Association" defined in its linguistic type "dependency"
ERROR: the tier "context" contains 15 alignable annotations not consistent with tier stereotype "Symbolic_Association"
Checking tier: note
ERROR: tier "note" has no parent tier but has stereotype CONSTRAINT "Symbolic_Association" defined in its linguistic type "dependency"
ERROR: the tier "note" contains 15 alignable annotations not consistent with tier stereotype "Symbolic_Association"
There are tier-type/tier-hierarchy inconsistencies. Please refer to the EAF format documentation:
@dopefishh yes, I've used this script with ELAN 5.1 and it used to work. Our annotators can't use that version anymore because of a Java version problem
It seems that they did indeed upgrade to XML schema version 3.0, and a warning is probably emitted when reading these files. Support for the 3.0 changes still needs to be implemented. I could definitely use help with this.
@dopefishh I can take a shot at this. Is there a sample 3.0 file similar to the ones for 2.7 and 2.8 under examples?
@sarpu: Thanks, I'll be happy to accept a PR
The old schema is available here: http://www.mpi.nl/tools/elan/EAFv2.8.xsd
The new schema is available here: http://www.mpi.nl/tools/elan/EAFv3.0.xsd
A human-readable explanation of the new schema is available here: https://www.mpi.nl/tools/elan/EAF_Annotation_Format_3.0_and_ELAN.pdf
The old schema's human-readable explanation is available here: https://www.mpi.nl/tools/elan/EAF_Annotation_Format_2.8_and_ELAN.pdf
So I have been digging around the code, and I don't think this issue is related to the EAF 3.0 update. In the add_tier function, the code picks the first available linguistic type if one is not specified. @macramole doesn't specify a linguistic type when adding a tier, so the code picks dependency.
The "code" tier, through its automatically picked linguistic type dependency, has the constraint "Symbolic_Association", which, by both the EAF 2.8 and 3.0 standards, means that the tier can only contain reference annotations. But @macramole is using the add_annotation function, which adds an aligned annotation rather than the reference annotation required by the tier's linguistic type. That is invalid under both 2.8 and 3.0, and it is probably why ELAN deletes the time references.
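To make the mismatch concrete, here is a rough sketch of the kind of check the ELAN validator seems to be doing, using only the standard library. The SAMPLE document and the find_inconsistent_tiers helper are made up for illustration (they are not part of pympi or ELAN), and the sketch only looks at the "Symbolic_Association" stereotype mentioned in the error output.

```python
import xml.etree.ElementTree as ET

# Stereotype from the validator output; tiers of such a type need a parent
# tier and may only hold reference annotations.
REF_ONLY = {"Symbolic_Association"}

# Hypothetical, heavily trimmed EAF fragment reproducing the situation.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1000"/>
  </TIME_ORDER>
  <TIER TIER_ID="code" LINGUISTIC_TYPE_REF="dependency">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>x</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="dependency" TIME_ALIGNABLE="false"
                   CONSTRAINTS="Symbolic_Association"/>
</ANNOTATION_DOCUMENT>"""


def find_inconsistent_tiers(root):
    """Return (tier_id, problem) pairs for tiers that break their stereotype."""
    # Map each linguistic type name to its stereotype (None if unconstrained).
    constraints = {
        lt.get("LINGUISTIC_TYPE_ID"): lt.get("CONSTRAINTS")
        for lt in root.iter("LINGUISTIC_TYPE")
    }
    problems = []
    for tier in root.iter("TIER"):
        if constraints.get(tier.get("LINGUISTIC_TYPE_REF")) not in REF_ONLY:
            continue
        tier_id = tier.get("TIER_ID")
        if tier.get("PARENT_REF") is None:
            problems.append((tier_id, "no parent tier"))
        aligned = tier.findall("ANNOTATION/ALIGNABLE_ANNOTATION")
        if aligned:
            problems.append((tier_id, f"{len(aligned)} alignable annotation(s)"))
    return problems


problems = find_inconsistent_tiers(ET.fromstring(SAMPLE))
for tier_id, problem in problems:
    print(f"{tier_id}: {problem}")
```

Running the same helper on ET.parse(path).getroot() for a real file should flag the same tiers that File -> Validate EAF File complains about, at least for this one stereotype.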
All of this is to say that I think the code should simply raise an exception when an aligned annotation is added to a tier whose linguistic type has a constraint that requires reference annotations (the two cannot be mixed in a tier). I will submit a PR to that effect; I looked over both the 2.8 and 3.0 specifications, and the changes between them don't seem likely to cause something like this. What do you think, @dopefishh?
And @macramole, if you specify a linguistic type that allows aligned annotations in your call to the add_tier function, I believe the code should work.
@macramole Can you verify that it works now with the merged MR?
Yes, I will report back here. What should I put as "linguistic type" so it works the way I want?
First create a linguistic type, like eaf.add_linguistic_type('custom') (custom is an arbitrary name I picked for the linguistic type; it can be anything you want). Then change all of the add_tier calls to the form add_tier('code', ling='custom') (again, instead of custom, use the linguistic type you just created above).
@macramole Did you manage to resolve the issue?
Hi, I'm trying to add some tiers and non-overlapping segments to my EAF file.
I'm using the following code:
The EAF files are created and I can open them with ELAN 5.9. I can see selected segments and everything seems to be working fine.
The problem is that when I add a new segment from ELAN and save, the file gets corrupted and cannot be opened anymore.
Examining the EAF file I can see that for instance these lines:
become:
TIME_SLOT_REF1 and TIME_SLOT_REF2 are empty! :(
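For anyone who wants to spot this symptom without opening the file in ELAN, a small standard-library sketch follows. The SAMPLE fragment and the find_broken_time_refs helper are hypothetical (they are not the lines from the affected file); the sketch lists aligned annotations whose time slot references are empty or point at no defined TIME_SLOT.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: annotation a2 has lost both time slot references.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1000"/>
  </TIME_ORDER>
  <TIER TIER_ID="code" LINGUISTIC_TYPE_REF="dependency">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2"/>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="" TIME_SLOT_REF2=""/>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def find_broken_time_refs(root):
    """Return IDs of aligned annotations with missing or dangling time refs."""
    known_slots = {ts.get("TIME_SLOT_ID") for ts in root.iter("TIME_SLOT")}
    broken = []
    for ann in root.iter("ALIGNABLE_ANNOTATION"):
        refs = (ann.get("TIME_SLOT_REF1"), ann.get("TIME_SLOT_REF2"))
        # An empty, absent, or unknown reference all count as broken.
        if any(not ref or ref not in known_slots for ref in refs):
            broken.append(ann.get("ANNOTATION_ID"))
    return broken


broken = find_broken_time_refs(ET.fromstring(SAMPLE))
print(broken)
```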
The original EAF files were created using chat2elan from the CLAN project. Opening and editing these EAF files with ELAN 5.9 works just fine.
System information
pip install version and also cloning this repo