signon-project / wp4-elan-processor

Error when running a folder of .eaf files while specifying no #1

Closed: euan-mcgill closed this issue 7 months ago

euan-mcgill commented 8 months ago

I'm attempting to parse the ELAN version of the NCSLGR (ASL) corpus, as found at http://asl.cs.depaul.edu/corpus/index.html

However, I encounter an error where the script seems to look for a folder inside a text file:

python elan_processor.py -i ~/Documents/corpora/NCSLGR/elanBUcorpus/ -o ~/Documents/corpora/NCSLGR/test/ --video_mode 4
[*] Starting to parse ELAN files
[-] 870 elan files were detected in /home/upf/Documents/corpora/NCSLGR/elanBUcorpus/
100%|█████████████████████████████████████████▉| 869/870 [00:04<00:00, 204.36it/s]

[*] Starting annotation aligment
[-] No leading modelity was not specified to align the annotations or the specified is not valid.
Please select a valid one:
     (1) POS2
     (2) body lean
     (3) head pos:jut
     (4) head pos: tilt side
     (5) English translation
     (6) neck
     (7) adverbial
     (8) head pos: jut
     (9) negative
     (10) head pos: turn
     (11) body mvmt
     (12) head pos: tilt fr/bk
     (13) literal translation
     (14) wh question
     (15) non-dominant hand gloss
     (16) head mvmt: jut
     (17) nondominant POS
     (18) rhetorical question
     (19) non-dominant POS
     (20) nose
     (21) shoulders
     (22) eye brows
     (23) role shift
     (24) head mvmt: side to side
     (25) main gloss
     (26) head mvmt: nod
     (27) eye gaze
     (28) topic/focus
     (29) Non-dominant POS
     (30) yes-no question
     (31) relative clause
     (32) conditional/when
     (33) eye aperture
     (34) head mvmt: shake
     (35) cheeks
     (36) mouthing
     (37) POS
     (38) mouth
5
[-] The aligment is going to peformed using English translation as leading modality
 19%|████████                                  | 166/871 [00:00<00:03, 216.57it/s]
Traceback (most recent call last):
  File "/home/upf/Documents/git/wp4-elan-processor/elan_processor.py", line 133, in <module>
    main()
  File "/home/upf/Documents/git/wp4-elan-processor/elan_processor.py", line 93, in main
    align_annotations(output_folder, leading_modality) # ALINGING ANNOTATIONS USING TIMESTAMPS
  File "/home/upf/Documents/git/wp4-elan-processor/utils.py", line 531, in align_annotations
    txt_files = [os.path.join(ann_path, fn) for fn in os.listdir(ann_path) if '.txt' in fn] 
NotADirectoryError: [Errno 20] Not a directory: '/home/upf/Documents/corpora/NCSLGR/test/parsing_report.txt/annotations'

SantiagoEG commented 8 months ago

Hi @euan-mcgill, the software is treating "parsing_report.txt" as a directory, when it is actually a file. I have not seen this issue before. I think it happens because you are using relative paths, so I recommend keeping the script and your dataset directories in the same place. I hope it helps!
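
The traceback is consistent with every entry of the output folder being joined with `annotations`, including the plain file `parsing_report.txt`. Below is a minimal sketch of a guard against that. It assumes the output folder holds one sub-directory per ELAN file, each with its own `annotations` folder; that layout is inferred from the error path and is not confirmed by the project. The helper names `find_annotation_dirs` and `list_txt_files` are hypothetical and are not part of `utils.py`.

```python
import os


def find_annotation_dirs(output_folder):
    """Return the 'annotations' folder of every sub-directory of output_folder.

    Hypothetical helper: the assumed layout is
    <output_folder>/<sample>/annotations/*.txt
    """
    # Expand "~" and make the path absolute so the result does not
    # depend on the current working directory.
    output_folder = os.path.abspath(os.path.expanduser(output_folder))
    ann_dirs = []
    for name in sorted(os.listdir(output_folder)):
        entry = os.path.join(output_folder, name)
        if not os.path.isdir(entry):
            # Skip plain files such as parsing_report.txt.
            continue
        ann_path = os.path.join(entry, "annotations")
        if os.path.isdir(ann_path):
            ann_dirs.append(ann_path)
    return ann_dirs


def list_txt_files(ann_path):
    """List the .txt files of one annotations folder, matching on the extension."""
    return [
        os.path.join(ann_path, fn)
        for fn in sorted(os.listdir(ann_path))
        if fn.endswith(".txt")
    ]
```

Checking `os.path.isdir` before joining avoids the `NotADirectoryError`, and `fn.endswith(".txt")` is stricter than the `'.txt' in fn` test shown in the traceback, which would also match a name such as `notes.txt.bak`.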