srvk / DiViMe

ACLEW Diarization Virtual Machine
Apache License 2.0

eval.sh and create_ref_sys.sh disagree on tool names #69

Closed riebling closed 6 years ago

riebling commented 6 years ago

In testing the test (more!) I came across this situation: I run eval.sh to get the usage, showing:

Usage: eval.sh <data> <system> <<optionalSAD>>
where data is the folder containing the data
and system is the system you want
to evaluate. Choices are:
  ldc_sad
  noisemes
  tocombosad
  opensmile
  diartk
  yunitate
If evaluating diartk, please give which flavour
of SAD you used to produce the diartk transcription
you want to evaluate

But then when I actually run it (and somewhere I didn't notice the output?), it seems that an enclosed script, create_ref_sys.sh, is needed to create the reference .lab files for the scoring tool(s). But that script is looking for different names. It checks them internally and does not list them in its usage; it just talks about a "model prefix", which confuses me:

if ! [[ $model_prefix =~ ^(ldc_sad|noisemes_sad|tocombo_sad|opensmile_sad|lena_sad|
                            diartk_ldcSad|diartk_noisemesSad|diartk_tocomboSad|diartk_opensmileSad|
                            diartk_goldSad|goldSad|yunitator|lena)$ ]]; then
    echo "You're trying to create folder containing the reference transcriptions, and the predicted ones."
    echo "However, you specified a wrong tool name."
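
One way to make the accepted names discoverable is to keep them in a single list that both the check and the error message read from. The sketch below is a hypothetical illustration, not the actual create_ref_sys.sh code: the names are copied from the regex quoted above, while `VALID_PREFIXES`, `is_valid_prefix` and `check_prefix` are made-up helper names.

```bash
# Hypothetical sketch. The names are copied from the regex quoted above;
# keeping them in one place lets validation and the error message agree.
VALID_PREFIXES="ldc_sad noisemes_sad tocombo_sad opensmile_sad lena_sad
diartk_ldcSad diartk_noisemesSad diartk_tocomboSad diartk_opensmileSad
diartk_goldSad goldSad yunitator lena"

# is_valid_prefix NAME: succeed if NAME appears in VALID_PREFIXES.
is_valid_prefix() {
    for name in $VALID_PREFIXES; do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

# check_prefix NAME: on rejection, list every accepted name on stderr.
check_prefix() {
    if ! is_valid_prefix "$1"; then
        echo "Unknown model prefix '$1'. Accepted values:" >&2
        printf '  %s\n' $VALID_PREFIXES >&2
        return 1
    fi
}
```

With this shape, `check_prefix noisemes` fails and prints the full list, so the caller can see that `noisemes_sad` was expected.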

Now that I re-read the usage for eval.sh, I'm not sure what "If evaluating diartk, please give which flavour of SAD you used to produce the diartk transcription you want to evaluate" is asking me to do. Where and how should a person do this? Maybe show it by example? Do I create a compound name like diartk_noisemes, maybe? Or maybe this is where "system name" comes into play, with names like "diartk_noisemesSad", "diartk_tocomboSad", "diartk_opensmileSad", etc., as specified farther down inside create_ref_sys.sh, evalDiar.sh, and evalSAD.sh.

So two points, really: the usage is confusing, and when I give eval.sh noisemes, it fails to produce reference transcriptions because create_ref_sys.sh wants noisemes_sad.

AHA, it's starting to make more sense: inside evalSAD.sh, it maps its arguments (such as ldc_sad, noisemes, tocombo_sad, opensmile, lena_sad) into the variable sys_name. This is an intermediate system name that create_ref_sys.sh is supposed to understand, but it gets lost in translation, because create_ref_sys.sh wants system names with capitalization, as in the check quoted above (ldcSad, etc.). The changing case in the names makes this really confusing (and in fact broken, I think?).
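
To make the translation step concrete, here is a minimal sketch of an explicit mapping from the names in the eval.sh usage to the names accepted by create_ref_sys.sh. The pairing is an assumption inferred from the two lists quoted in this issue, not taken from the DiViMe scripts, and `to_model_prefix` is a hypothetical helper name.

```bash
# to_model_prefix SYSTEM [SAD]: print a create_ref_sys.sh-style name.
# ASSUMPTION: the pairing below is guessed from the two name lists in this
# issue; it is not the mapping that evalSAD.sh or evalDiar.sh performs.
to_model_prefix() {
    system=$1
    sad=${2:-}
    case $system in
        ldc_sad)    echo "ldc_sad" ;;
        noisemes)   echo "noisemes_sad" ;;
        tocombosad) echo "tocombo_sad" ;;
        opensmile)  echo "opensmile_sad" ;;
        yunitate)   echo "yunitator" ;;
        diartk)
            # diartk needs to know which SAD produced its input.
            case $sad in
                ldc_sad)    echo "diartk_ldcSad" ;;
                noisemes)   echo "diartk_noisemesSad" ;;
                tocombosad) echo "diartk_tocomboSad" ;;
                opensmile)  echo "diartk_opensmileSad" ;;
                *)  echo "diartk needs a SAD flavour: ldc_sad, noisemes, tocombosad or opensmile" >&2
                    return 1 ;;
            esac ;;
        *)  echo "Unknown system '$system'" >&2
            return 1 ;;
    esac
}
```

Under these assumptions, `to_model_prefix diartk noisemes` prints `diartk_noisemesSad`, and calling it with `diartk` alone fails with a message that names the missing argument.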

riebling commented 6 years ago

Wait, wait, it's making slightly more sense now. (The lack of examples and the incomplete usage statements had me playing guessing games with names: changing case and appending _sad arbitrarily.) The <<optionalSAD>> is crucial: this new concept uses different tool names, am I right? Does optionalSAD take tool names with capitalization in them? Maybe? Or tool names with _sad in them? Oh heck, back to being confused again.

riebling commented 6 years ago

OK, fixed the bug! In create_ref_sys.sh it was creating reference files with .rttm.lab filenames, which broke the evaluation later on. The code that tests for the existence of the ref and sys .lab files was not finding them (the names were wrong), so it ALWAYS output scores of "75.00% 0.00% 100.00%" rather than running score.py.
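
For illustration, here is a minimal sketch of how a doubled extension like this can arise and be avoided. How create_ref_sys.sh actually built the names is not shown in this thread, so the helpers below (`lab_name`, `require_labs`) are hypothetical. Appending `.lab` to a name that still ends in `.rttm` gives `x.rttm.lab`; stripping the suffix first with `${1%.rttm}` gives `x.lab`. The second helper fails loudly when a file is missing, rather than letting a fallback score through.

```bash
# lab_name FILE: print FILE with a trailing ".rttm" replaced by ".lab".
# Appending instead (echo "$1.lab") would turn x.rttm into x.rttm.lab.
lab_name() {
    echo "${1%.rttm}.lab"
}

# require_labs REF SYS: return non-zero and report the missing file,
# so the caller can stop instead of printing a placeholder score.
require_labs() {
    for f in "$1" "$2"; do
        if [ ! -f "$f" ]; then
            echo "Missing .lab file: $f" >&2
            return 1
        fi
    done
}
```

A caller could then run something like `require_labs "$ref" "$sys" || exit 1` before invoking the scorer (the variable names here are placeholders).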

MarvinLvn commented 6 years ago

My bad. I should have been more careful about the .lab files. I wrote this script to refactor the code that was common between evalDiar.sh and evalSAD.sh. Now that you've fixed that, I think the error messages should be clear enough and everything should work well.

The lack of clarity comes from the model naming convention, which is not consistent with itself and not consistent with the output rttm files. We need to improve that. :/