xinjli / allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
GNU General Public License v3.0

Deterministic output #21

Closed by eremingt 3 years ago

eremingt commented 3 years ago

I noticed that the output varies from call to call. For example, I just ran the same 15-second sample 10 times, and the number of phones in the output differed between runs:

[197, 198, 200, 199, 196, 195, 203, 195, 198, 197]

Is it possible to configure/modify the code slightly to generate deterministic results? I'm not sure, but I suspect this has something to do with Torch.
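A check like the one described above can be sketched as a small harness. This is only an illustration: `phone_counts` and `is_deterministic` are hypothetical helper names, and `recognize` is assumed to be any callable that takes a file path and returns a string of space-separated phones.

```python
from typing import Callable, List


def phone_counts(recognize: Callable[[str], str], path: str, runs: int = 10) -> List[int]:
    """Run `recognize` on the same file `runs` times and count the phones in each output.

    Assumes the recognizer returns phones separated by whitespace.
    """
    return [len(recognize(path).split()) for _ in range(runs)]


def is_deterministic(recognize: Callable[[str], str], path: str, runs: int = 10) -> bool:
    """Return True if every run produced exactly the same output string."""
    outputs = {recognize(path) for _ in range(runs)}
    return len(outputs) == 1
```

With allosaurus, `recognize` could presumably be the `recognize` method of the object returned by `read_recognizer()` in `allosaurus.app`, as shown in the project README. Comparing the full output strings is a stricter test than comparing phone counts, since two runs can produce the same count with different phones.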

xinjli commented 3 years ago

Hi,

The randomness actually comes from a preprocessing step called dithering, which adds a small amount of noise to the signal to deal with quantization errors. It is not that important at inference time, so I have disabled this step in the latest version.

If you upgrade to the latest version, the results should be deterministic now.
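To illustrate why dithering makes the output vary, here is a minimal sketch. It is not allosaurus's actual preprocessing code; the `dither` function, its `scale` parameter, and the Gaussian noise model are assumptions made for the example. The idea is that a little random noise is added to every sample before feature extraction, so two runs on the same audio see slightly different inputs unless the noise is turned off or the random generator is seeded.

```python
import random
from typing import List, Optional


def dither(samples: List[float], scale: float = 1.0,
           rng: Optional[random.Random] = None) -> List[float]:
    """Add small Gaussian noise to each sample (hypothetical sketch).

    scale == 0.0 disables dithering and returns the samples unchanged.
    Passing a seeded `rng` makes the noise reproducible.
    """
    if scale == 0.0:
        return list(samples)
    rng = rng or random.Random()  # unseeded: different noise on every call
    return [s + scale * rng.gauss(0.0, 1.0) for s in samples]


signal = [100.0, -50.0, 25.0]

# Dithering disabled: the input passes through unchanged, so downstream
# results are identical from run to run.
clean = dither(signal, scale=0.0)

# Dithering enabled with a fixed seed: noisy, but reproducible.
seeded_a = dither(signal, rng=random.Random(0))
seeded_b = dither(signal, rng=random.Random(0))
```

Disabling the step, as in the fix described above, removes the noise entirely. Seeding is an alternative when the dithering itself should be kept.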

eremingt commented 3 years ago

Awesome, closing! New results from same file:

[190, 190, 190, 190, 190, 190, 190, 190, 190, 190]

Consistent, with slightly fewer identified phones than before.