dopefishh / pympi

A python module for processing ELAN and Praat annotation files
MIT License

KeyError when retrieving annotations #16

Closed GladB closed 5 years ago

GladB commented 5 years ago

I am trying to retrieve the annotations for each tier in an .eaf file, using Elan.py, and there seems to be an error for the lex@CHI tier in the file that I am trying to process (also happens with mwu@CHI in other files). When I look at self.tiers[self.annotations[ref]] in def get_ref_annotation_data_for_tier(self, id_tier), it tries to retrieve the first element of that structure, which is an empty dictionary for lex@CHI but not for other tiers. The second element seems to contain the data that I am looking for, but only for lex@CHI.

I am attaching the .eaf file and a python script that triggers this error. Is it something wrong with the file or is it the code?

Thank you!

issue.zip

dopefishh commented 5 years ago

Thanks for reporting. I'll try to find time this week.

dopefishh commented 5 years ago

Which version of Python are you using? When I run the script, it does not crash.

GladB commented 5 years ago

Whether I use python 2.7 or 3.7, I get the following error:

$ python3 test_pympi.py  ~/ACLEW_ReliabilityChecker/reliability_shiny_app/issue/orig_no_id.eaf

Parsing unknown version of ELAN spec... This could result in errors...
Traceback (most recent call last):
  File "test_pympi.py", line 11, in <module>
    annotations = EAF.get_annotation_data_for_tier(tier)
  File "/usr/local/lib/python3.7/site-packages/pympi/Elan.py", line 627, in get_annotation_data_for_tier
    return self.get_ref_annotation_data_for_tier(id_tier)
  File "/usr/local/lib/python3.7/site-packages/pympi/Elan.py", line 955, in get_ref_annotation_data_for_tier
    refann = self.tiers[self.annotations[ref]][0][ref]
KeyError: 'a913'

The key seems to be different at each run.
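One possible reading of this traceback, sketched with plain dictionaries rather than pympi itself: the failing line looks up the referenced annotation only in element [0] of its tier (assumed here to hold time-aligned annotations), so it fails when the referenced annotation is itself stored in element [1] (assumed to hold reference annotations). The tier names, annotation ids, and both helper functions below are hypothetical and only illustrate the idea.

```python
# Toy stand-in for the structure described in this thread; NOT pympi's real data.
# Assumption: each tier is a pair (aligned_annotations, reference_annotations).
tiers = {
    "parent": ({"a1": (0, 1500, "hello")}, {}),
    "child": ({}, {"a2": ("a1", "C")}),       # refers to an aligned annotation
    "grandchild": ({}, {"a3": ("a2", "W")}),  # refers to another reference annotation
}
# Maps every annotation id to the tier that owns it.
annotation_to_tier = {"a1": "parent", "a2": "child", "a3": "grandchild"}


def lookup_naive(ref):
    """Mirror the failing line: only look in element [0] of the owning tier."""
    return tiers[annotation_to_tier[ref]][0][ref]


def lookup_chained(ref):
    """Follow references until a time-aligned annotation is reached."""
    seen = set()
    while ref not in seen:
        seen.add(ref)
        aligned, referenced = tiers[annotation_to_tier[ref]]
        if ref in aligned:
            return aligned[ref]
        ref = referenced[ref][0]  # step up to the annotation this one points at
    raise ValueError("reference cycle detected")


if __name__ == "__main__":
    print(lookup_naive("a1"))    # found directly in the aligned dictionary
    print(lookup_chained("a2"))  # resolved through one intermediate reference
    try:
        lookup_naive("a2")
    except KeyError as err:
        print("KeyError:", err)  # same failure mode as in the traceback
```

If this reading is right, the error would appear only for tiers whose parent tier is itself a reference tier, which would match it showing up for some tiers and not others.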

dopefishh commented 5 years ago

I cannot reproduce this. The script doesn't run as provided: it is lacking imports, and setdefaultencoding is deprecated anyway. To reproduce, I did this:

git clone https://github.com/dopefishh/pympi
cd pympi
virtualenv -p python3 .
. bin/activate
python setup.py install
wget https://github.com/dopefishh/pympi/files/3149036/issue.zip
unzip issue.zip
cd issue
# change the file so that it compiles, i.e. remove the reload and defaultencoding line
python test_pympi.py orig_no_id.eaf

And it works fine.

GladB commented 5 years ago

Indeed, the script I provided runs only with Python 2; the reload and setdefaultencoding lines were removed for Python 3, sorry about that. When I run it in the virtualenv just like you did, it does work, but I still have the issue when I run it outside of that environment. If it is only me, I guess I will fix it locally. Thanks anyway!
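Since the behaviour differs inside and outside the virtualenv, a first check could be which copy of pympi each interpreter actually picks up. The following is a minimal sketch, not part of pympi: the helper name describe_install is hypothetical, "pympi-ling" is assumed to be the installed distribution name, and the version lookup needs Python 3.8 or newer (it falls back to None otherwise).

```python
import importlib.util
import sys

try:
    from importlib import metadata  # available from Python 3.8
except ImportError:
    metadata = None


def describe_install(module_name="pympi", dist_name="pympi-ling"):
    """Report which interpreter is running and where a module would load from."""
    spec = importlib.util.find_spec(module_name)  # locates without importing
    location = spec.origin if spec is not None else None
    version = None
    if metadata is not None:
        try:
            version = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            version = None  # distribution not installed under this name
    return {
        "interpreter": sys.executable,
        "location": location,
        "version": version,
    }


if __name__ == "__main__":
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

Running this once inside the virtualenv and once outside should show whether the two environments load different copies or versions of the package.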

dopefishh commented 5 years ago

Okay, I'll close it for now then. Feel free to reopen if you can reproduce the problem with the latest master.