srvk / DiViMe

ACLEW Diarization Virtual Machine
Apache License 2.0

Out-of-the-box WCE tool not running #139

Closed marisacasillas closed 5 years ago

marisacasillas commented 5 years ago

I created SAD and diarization files using the various tools listed on the main DiViMe usage page. WCE wasn't on that page, so I went to its separate page and tried the out-of-the-box command: vagrant ssh -c "estimateWCE.sh data/CogSciTutorial/ data/CogSciTutorial/WCE_output.txt"

Here's what I get:

Running WCE module (this might take a while...)
Error using vertcat
Dimensions of matrices being concatenated are not consistent.

Error in haeMelPiirteet (line 42)

Error in LSTMseg (line 38)

Error in getSyllables (line 57)

Error in WCEestimate (line 89)

MATLAB:catenate:dimensionMismatch
paste: /vagrant/data/CogSciTutorial/WCE_output.txt: No such file or directory
WCE processing complete. Wrote output to /vagrant/data/CogSciTutorial/WCE_output.txt
Connection to 127.0.0.1 closed.

(flagging the divime team, but also @orasanen in particular)

alecristia commented 5 years ago

Actually, I think Okko is away until July 22. I would propose removing WCE for now, or at least moving it to a section of the docs called "tools under development, use at own risk". With Okko away, whoever wants to fix this would have to invest quite a bit of time working out how best to do it.

orasanen commented 5 years ago

Middy: Can you share the CogSciTutorial folder with me so I can try it out? If I just run sh estimateWCE.sh /vagrant/data/wav/ ~/my_WCE_output.txt where /vagrant/data/wav/ contains my wav files (syntax as specified in the launcher script example), everything works fine.

marisacasillas commented 5 years ago

https://github.com/aclew/DaylongDataTutorial-CogSci19/tree/master/divime-demo/tool-runs

orasanen commented 5 years ago

The issue is that the wav files are stereo, but the tool only supports mono signals (the specification was mono .wav files of up to 5 min, though it seems to run on signals a bit longer than 5 min as well). The error message does not make the mono requirement clear, so I need to fix that and the tool documentation when I get back to the lab (or, actually, this version will become obsolete if we move to Shreyas' new syllable estimator soon).
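Until the error message is improved, a quick pre-check can flag the offending files before the tool is launched. The following is only a sketch, not part of DiViMe: `find_non_mono` is a hypothetical helper name, and it relies on Python's standard `wave` module, which reads uncompressed PCM .wav files only.

```python
import wave
from pathlib import Path


def find_non_mono(wav_folder):
    """Return (file name, channel count) for each .wav in wav_folder that is not mono."""
    offenders = []
    for path in sorted(Path(wav_folder).glob("*.wav")):
        # wave.open raises wave.Error for non-PCM files; this sketch does not handle that case.
        with wave.open(str(path), "rb") as wav:
            channels = wav.getnchannels()
        if channels != 1:
            offenders.append((path.name, channels))
    return offenders
```

Calling `find_non_mono("data/CogSciTutorial/")` from the shared folder would list every stereo file that triggers the `vertcat` error; an empty list means all files meet the mono requirement.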

If you convert the wav files into mono using, e.g., Praat, the WCE tool will work.
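For anyone who prefers to script the conversion rather than use Praat, here is a minimal sketch using only the Python standard library. It assumes 16-bit PCM .wav files and simply averages the channels; `stereo_to_mono` is a hypothetical helper, not something shipped with DiViMe.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Average all channels of a 16-bit PCM .wav file into one channel."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM audio")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...); average each frame.
    mono = array.array("h", (
        sum(samples[i:i + n_channels]) // n_channels
        for i in range(0, len(samples), n_channels)
    ))
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

Averaging keeps both channels' content at the original sampling rate; keeping only one channel would be an equally valid choice if one microphone channel is cleaner.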

marisacasillas commented 5 years ago

Ah! Excellent to know. I will also pass this on in the instructions to workshop participants.

orasanen commented 5 years ago

You could pre-convert the workshop demo files to mono and share them with the participants, but if they have already downloaded them, then that's another matter.

Also, note that the WCE tool, as it is currently run, doesn't use the SAD information. Instead, it just treats all the .wav contents as speech.

There's the "WCE_preprocess.sh" in "utils" that should be able to segment longer files into SAD-based utterances. However, the usage is a bit tricky, because we agreed to use ACLEW file naming conventions to keep track of which .eaf file goes with which .wav etc.

WCE_preprocess.sh manual says the following:

usage: WCE_preprocess.sh [wav_folder] [eaf_folder] [language] [SAD]
  wav_folder   The folder where to find the wav files (REQUIRED).
  eaf_folder   The folder where to find the eaf files (REQUIRED).
  language     The language of the transcription: english, spanish or tzeltal (REQUIRED).
  SAD          The SAD used to detect speech: opensmileSad (DEFAULT), tocomboSad

Wav files have to follow ACLEW file naming conventions: COR_baby_yyyyyy_zzzzzz.wav where COR is the three-character ID of the corpus, baby is the four-digit identifier of the baby, and yyyyyy and zzzzzz are the 2/5 min segment onsets and offsets in seconds measured from the beginning of the daylong file, e.g., BER_0396_005220_005340.wav

Eaf files must be of the form xxxx.eaf, where xxxx is the babyID corresponding to the .wav files.
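To make the naming convention concrete, here is a small sketch that checks a .wav name against it and derives the matching .eaf name. The regular expression and the names `ACLEW_WAV`, `AclewName` and `parse_aclew_wav` are illustrative assumptions, not code from WCE_preprocess.sh; in particular, treating the corpus ID as three letters or digits is a guess at what "three-character" allows.

```python
import re
from typing import NamedTuple, Optional

# COR_baby_yyyyyy_zzzzzz.wav: 3-character corpus ID, 4-digit baby ID,
# 6-digit onset and offset in seconds (assumed pattern, see lead-in).
ACLEW_WAV = re.compile(r"^([A-Za-z0-9]{3})_(\d{4})_(\d{6})_(\d{6})\.wav$")


class AclewName(NamedTuple):
    corpus: str
    baby_id: str
    onset_s: int
    offset_s: int
    eaf_name: str


def parse_aclew_wav(filename: str) -> Optional[AclewName]:
    """Split an ACLEW-style .wav name into its parts, or return None if it does not match."""
    match = ACLEW_WAV.match(filename)
    if match is None:
        return None
    corpus, baby_id, onset, offset = match.groups()
    return AclewName(corpus, baby_id, int(onset), int(offset), f"{baby_id}.eaf")
```

For the example above, `parse_aclew_wav("BER_0396_005220_005340.wav")` gives corpus `BER`, baby ID `0396`, onset 5220 s, offset 5340 s (a 2 min segment), and the expected annotation file `0396.eaf`.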

Would it be preferable to have a script that simply reads the SAD .rttm files, runs the WCE on each segment in the .rttms, and returns the results for those?
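As a rough illustration of what such a script would need to do first, the sketch below reads SAD segments from an .rttm file and converts each one to a sample range that could be cut from the .wav. It assumes the standard RTTM `SPEAKER` line layout (file ID, channel, onset, duration, then the label in the eighth field); `Segment`, `read_rttm` and `frame_range` are hypothetical names, and this is not the implementation used in DiViMe.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    file_id: str
    onset_s: float
    duration_s: float
    label: str


def read_rttm(path: str) -> List[Segment]:
    """Collect the SPEAKER lines of an .rttm file as Segment tuples."""
    segments = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines and any non-SPEAKER records.
            if len(fields) < 8 or fields[0] != "SPEAKER":
                continue
            segments.append(
                Segment(fields[1], float(fields[3]), float(fields[4]), fields[7])
            )
    return segments


def frame_range(segment: Segment, frame_rate: int) -> Tuple[int, int]:
    """Convert a segment's onset and duration in seconds to (start, end) sample indices."""
    start = round(segment.onset_s * frame_rate)
    end = round((segment.onset_s + segment.duration_s) * frame_rate)
    return start, end
```

Each `(start, end)` pair could then be used to slice the audio and pass only that stretch to the word count estimator, with the result written back next to the segment's onset and duration.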

orasanen commented 5 years ago

Added functionality to run WCE on .rttm SAD segments and get outputs in .rttm format. See the official documentation page. ~/launcher and ~/repos/WCE_VM/ need to be pulled from master for changes to take effect.