CoEDL / elpis

🙊 software for creating speech recognition models.
https://elpis.readthedocs.io/en/latest/
Apache License 2.0

input file options #341

Open HedvigS opened 1 year ago

HedvigS commented 1 year ago

I have Samoan data that I have archived with PARADISEC. As per their data design, the files that belong to the same event have the same names - save the last element. For example:

HSS01-20150822_NM73A-ELAN.eaf
HSS01-20150822_NM73A-Tr1.WAV
HSS01-20150822_NM73A-TrLR.WAV

These all belong to the same recording: one eaf-file and two wav-files.

I would like to feed all of my 58 eaf-files with accompanying wav-files into ELPIS to see how well it can do with that amount of data. I attended a training session with Daan and Nay recently and got some of my data through, but I want to do it at a larger scale.

Is there a way of doing this that doesn't involve me changing filenames? Can I somehow tell ELPIS "if two files have the same name save for the content after the last dash, then treat them as linked"?
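
To make the matching rule I have in mind concrete, here is a minimal Python sketch. It is not ELPIS functionality and does not call any ELPIS code; `group_by_prefix` is a hypothetical helper name I made up. It only shows the grouping: drop the extension, cut the name at the last dash, and treat files that share what is left as linked.

```python
from collections import defaultdict
from pathlib import Path


def group_by_prefix(filenames):
    """Group file names that share everything before the last dash.

    Hypothetical helper for illustration only, not part of ELPIS.
    """
    groups = defaultdict(list)
    for name in filenames:
        stem = Path(name).stem  # file name without its extension
        prefix, sep, _ = stem.rpartition("-")  # split at the last dash
        key = prefix if sep else stem  # no dash: the file is its own group
        groups[key].append(name)
    return {key: sorted(names) for key, names in groups.items()}


files = [
    "HSS01-20150822_NM73A-ELAN.eaf",
    "HSS01-20150822_NM73A-Tr1.WAV",
    "HSS01-20150822_NM73A-TrLR.WAV",
]
linked = group_by_prefix(files)
# linked has one key, "HSS01-20150822_NM73A", holding all three file names
```

With my 58 recordings this would give 58 groups, each holding one eaf-file and its wav-files.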