voicesauce / opensauce-python

Voice analysis software (Python port of VoiceSauce)
Apache License 2.0

Snack formant estimates differ drastically between Windows executable and Tcl shell #27

Open terriyu opened 7 years ago

terriyu commented 7 years ago

There are two main ways to run Snack commands: through the Windows standalone executable from VoiceSauce, or by calling Snack from the Tcl shell. Even with the same input sound file and the same parameters, the Snack formant command gives very different results depending on which of the two I use. The formant estimates differ by more than 10%.
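For anyone who wants to quantify the gap on their own files, here is a minimal sketch of the comparison. The function name and the sample values are made up for illustration; they are not OpenSauce code or real measurements, and it assumes both methods return one estimate per frame.

```python
def max_relative_difference(reference, other):
    """Largest |a - b| / |a| over paired frames of two formant tracks.

    Frames where the reference value is zero are skipped to avoid
    dividing by zero.
    """
    if len(reference) != len(other):
        raise ValueError("tracks must have the same number of frames")
    diffs = [abs(a - b) / abs(a) for a, b in zip(reference, other) if a != 0]
    return max(diffs) if diffs else 0.0


# Hypothetical F1 estimates in Hz (illustrative values, not real output)
tcl_f1 = [500.0, 520.0, 510.0]   # e.g. from the Tcl shell
exe_f1 = [500.0, 600.0, 505.0]   # e.g. from the Windows executable

worst = max_relative_difference(tcl_f1, exe_f1)
print(f"largest relative difference: {worst:.1%}")
```

A value above 0.10 corresponds to the "greater than 10%" discrepancy described above.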

terriyu commented 7 years ago

Comparison between using Tcl shell (blue) and Windows executable (red)

[Plots for sound files: beijing_f3_50_a, beijing_m5_17_c, cant_f6_40_b, cant_f6_95_b, hmong_f4_18_b, hmong_f4_24_d, hmong_m6_24_c]

terriyu commented 7 years ago

I'm told by the maintainers of VoiceSauce at UCLA that the Windows executable does not actually apply all the parameters that are passed to it. It likely performs the Snack calculations using some unknown default parameters. Since we don't understand what the Windows executable is doing, the default on Windows is now to call Snack from the Tcl shell instead of using the standalone executable. See commit https://github.com/voicesauce/opensauce-python/commit/bf07dfb30188ab6ad6a7cac9db63c3833f7dbf58.