google / visqol

Perceptual Quality Estimator for speech and audio
Apache License 2.0

Does the test speech need to have the same length as the reference speech? #71

Closed ZhengRachel closed 1 year ago

ZhengRachel commented 1 year ago

Hi, I would like to know whether visqol can be used when the length of the test speech is different from that of the reference speech?

mchinen commented 1 year ago

Hi, they don't have to be exactly the same length, and they can have a small amount of lag (~1 second) between them. So they should have fairly similar durations.

ZhengRachel commented 1 year ago

Does this mean that if the length difference between the reference speech and the test speech is more than 1 second, visqol will not work, or that the results it gives will no longer make sense?

mchinen commented 1 year ago

If the lengths differ by more than a second, you'll see a warning message. If the overlapping aligned region is long enough (3-10s), it can still produce useful results, but you should inspect the alignment. The non-overlapping sections will not be compared, which may be problematic depending on your use case. Using --verbose will show you how the files are aligned.
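
For anyone who wants to check their files before running visqol, here is a minimal sketch of a pre-check based on the rough numbers in this thread. It is not part of visqol and does not call it. The helper names (`wav_duration_seconds`, `check_durations`) and the default thresholds (`max_gap=1.0`, `min_overlap=3.0`) are assumptions for illustration, taken from the "~1 second" and "3-10s" figures above. It only compares file durations, so the overlap figure is an upper bound; the actual aligned region still has to be inspected with --verbose.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_durations(ref_seconds, test_seconds, max_gap=1.0, min_overlap=3.0):
    """Compare two durations against the rough limits from this thread.

    max_gap and min_overlap are illustrative defaults (assumptions),
    not values read from the visqol source code.
    """
    gap = abs(ref_seconds - test_seconds)
    # The aligned overlap can never be longer than the shorter file.
    overlap_upper_bound = min(ref_seconds, test_seconds)
    return {
        "gap_seconds": gap,
        "gap_ok": gap <= max_gap,
        "overlap_upper_bound_seconds": overlap_upper_bound,
        "overlap_may_suffice": overlap_upper_bound >= min_overlap,
    }
```

Usage would look like `check_durations(wav_duration_seconds("ref.wav"), wav_duration_seconds("test.wav"))`, where the file names are placeholders. If `gap_ok` is false, expect the warning mentioned above and check the alignment by hand.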

ZhengRachel commented 1 year ago

Thanks for your reply. I will have a try :)