marisacasillas closed 5 years ago
Actually, I think Okko is away until July 22. I would propose removing WCE for now, or at least hiding it in a section of the docs called "tools under development, use at own risk", because with Okko away, whoever wants to fix this would have to invest quite a bit of time to understand how best to do it.
Middy: Can you share the CogSciTutorial folder with me so I can try it out? If I just run sh estimateWCE.sh /vagrant/data/wav/ ~/my_WCE_output.txt where /vagrant/data/wav/ contains my wav files (syntax as specified in the launcher script example), everything works fine.
The issue is that the wav files are stereo, but the tool only supports mono signals (the specification was mono .wavs of up to 5 min in length, but it seems to run on signals a bit longer than 5 min as well). The error message is not very transparent about the mono requirement, so I have to fix that and the tool documentation when I get back to the lab (or, actually, this version will become obsolete if we move to Shreyas' new syllable estimator soon).
If you convert the wav files into mono using, e.g., Praat, the WCE tool will work.
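For anyone who prefers to script the conversion rather than use Praat, here is a minimal sketch in Python using only the standard library. It is not part of the WCE tool; the function name `stereo_to_mono` is made up for illustration, and it assumes 16-bit PCM .wav input, downmixing by averaging the two channels.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM WAV file to mono by averaging left and right."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Assumption of this sketch: 16-bit samples, one or two channels.
    if sample_width != 2:
        raise ValueError("only 16-bit PCM is handled in this sketch")
    if n_channels not in (1, 2):
        raise ValueError("expected a mono or stereo file")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    if n_channels == 2:
        # Stereo frames are interleaved: L, R, L, R, ...
        left = samples[0::2]
        right = samples[1::2]
        mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))
    else:
        mono = samples  # already mono, copy through unchanged

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

Calling `stereo_to_mono("in.wav", "in_mono.wav")` on each file in the data folder should give files that the WCE tool accepts, as long as the originals really are 16-bit PCM.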
Ah! Excellent to know. I will also pass this on in the instructions to workshop participants.
You could pre-convert the workshop demo files to mono and share them with the participants, but if they have already downloaded them, then that's another matter.
Also, note that the WCE tool doesn't know how to use the SAD information in the way it's currently used. Instead, it's just treating all the .wav contents as speech.
There's the "WCE_preprocess.sh" in "utils" that should be able to segment longer files into SAD-based utterances. However, the usage is a bit tricky, because we agreed to use ACLEW file naming conventions to keep track of which .eaf file goes with which .wav etc.
WCE_preprocess.sh manual says the following:
usage: WCE_preprocess.sh [wav_folder] [eaf_folder] [language] [SAD]
  wav_folder  The folder where to find the wav files (REQUIRED).
  eaf_folder  The folder where to find the eaf files (REQUIRED).
  language    The language of the transcription: english, spanish or tzeltal (REQUIRED).
  SAD         The SAD used to detect speech: opensmileSad (DEFAULT), tocomboSad
Wav files have to follow ACLEW file naming conventions: COR_baby_yyyyyy_zzzzzz.wav, where COR is the three-character ID of the corpus, baby is a four-digit identifier of the baby, and yyyyyy and zzzzzz are the onset and offset of the 2- or 5-minute segment in seconds, measured from the beginning of the daylong file, e.g., BER_0396_005220_005340.wav
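To check whether a folder of files follows this convention before running the preprocessing, something like the sketch below could help. It is only an illustration of the naming rule as described here: the names `ACLEW_WAV`, `AclewSegment` and `parse_aclew_name` are hypothetical, and the pattern assumes the corpus ID is exactly three alphanumeric characters.

```python
import re
from typing import NamedTuple

# COR_baby_yyyyyy_zzzzzz.wav: 3-char corpus ID, 4-digit baby ID,
# 6-digit onset and offset in seconds.
ACLEW_WAV = re.compile(r"^([A-Za-z0-9]{3})_(\d{4})_(\d{6})_(\d{6})\.wav$")


class AclewSegment(NamedTuple):
    corpus: str
    baby_id: str
    onset_s: int
    offset_s: int


def parse_aclew_name(filename):
    """Split an ACLEW-style wav file name into its parts, or raise ValueError."""
    match = ACLEW_WAV.match(filename)
    if match is None:
        raise ValueError(f"not an ACLEW-style wav name: {filename!r}")
    corpus, baby_id, onset, offset = match.groups()
    return AclewSegment(corpus, baby_id, int(onset), int(offset))


segment = parse_aclew_name("BER_0396_005220_005340.wav")
# The baby ID is what links a wav segment to its xxxx.eaf file.
eaf_name = f"{segment.baby_id}.eaf"
```

For the example name above, the onset and offset are 5220 s and 5340 s, so the segment is 120 s long.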
Eaf files must be of the form xxxx.eaf, where xxxx is the baby ID corresponding to the .wav files.
Would it be preferable to have a script that simply reads the SAD .rttm files, runs the WCE on each segment in the .rttms, and returns the results for those?
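As a rough sketch of what reading the SAD segments could look like (not the actual tool code): the function below assumes the SAD tools write standard RTTM lines, where each `SPEAKER` line carries the file ID in field 2, the onset in field 4 and the duration in field 5, both in seconds. The name `read_rttm_segments` is made up for illustration, and the step that would run WCE on each segment is deliberately left out.

```python
def read_rttm_segments(lines):
    """Return (file_id, onset_s, offset_s) for each SPEAKER line of an RTTM."""
    segments = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments and any non-SPEAKER record types.
        if len(fields) < 5 or fields[0] != "SPEAKER":
            continue
        file_id = fields[1]
        onset = float(fields[3])
        duration = float(fields[4])
        segments.append((file_id, onset, onset + duration))
    return segments


example = [
    "SPEAKER BER_0396_005220_005340 1 1.5 2.25 <NA> <NA> speech <NA> <NA>",
    "",
    "SPEAKER BER_0396_005220_005340 1 10.0 0.5 <NA> <NA> speech <NA> <NA>",
]
segments = read_rttm_segments(example)
```

Each returned tuple gives the time span that would be cut out of the corresponding .wav and passed to the word count estimator.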
Added functionality to run WCE on .rttm SAD segments and get outputs in .rttm format. See the official documentation page. ~/launcher and ~/repos/WCE_VM/ need to be pulled from master for changes to take effect.
I created SAD and diarization files using the various tools listed on the main divime usage page. I noticed that WCE wasn't on that page, so I navigated to its separate page and tried the out-of-the-box command:
vagrant ssh -c "estimateWCE.sh data/CogSciTutorial/ data/CogSciTutorial/WCE_output.txt"
Here's what I get:
(flagging the divime team, but also @orasanen in particular)