gabrielmittag closed this issue 3 years ago
Thanks for the report. That error message is incorrect (I have a patch to fix it), but ViSQOL is still not getting enough samples for some reason. If the input files are roughly aligned speech of at least 3 seconds, this may indicate a bug in the alignment step (484 samples is not enough to build a single frame, and not enough for ViSQOL to comment on quality).
Can you confirm the input lengths of both the degraded and reference files, and that they contain the same utterance? Are you able to share the files by any chance? If the files are valid, having them would help with debugging.
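In case it helps with checking the lengths, here is a minimal sketch using Python's standard `wave` module. It assumes both files are uncompressed PCM WAV. The function names `wav_info` and `compare_lengths` are made up for illustration and are not part of ViSQOL. The 3-second threshold is only the rough figure mentioned above, not a documented ViSQOL limit.

```python
import wave


def wav_info(path):
    """Return (num_frames, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
    return frames, rate, frames / rate


def compare_lengths(reference_path, degraded_path, min_seconds=3.0):
    """Summarize both inputs so length or sample-rate mismatches are easy to spot.

    min_seconds is an assumed threshold taken from the discussion,
    not a value read from ViSQOL itself.
    """
    ref = wav_info(reference_path)
    deg = wav_info(degraded_path)
    return {
        "reference": ref,
        "degraded": deg,
        "same_rate": ref[1] == deg[1],
        "both_long_enough": ref[2] >= min_seconds and deg[2] >= min_seconds,
    }
```

Calling `compare_lengths("reference.wav", "degraded.wav")` (placeholder file names) returns a small dictionary that can be pasted into the issue. It does not say anything about whether the two files contain the same utterance; that still needs to be checked by listening.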
Thanks for your answer. I uploaded two sets of degraded and reference files. The first one gives the error in audio mode and the second one in speech mode.
Thank you, this seems like an interesting case. I will be looking at it.
I noticed where this was failing and submitted a PR for it here: https://github.com/google/visqol/pull/34
I closed that PR in favour of an alternate fix being rolled out.
This should be fixed now.
I get this error for various files, although the reference files are long enough and can, for example, be scored with POLQA.