google / visqol

Perceptual Quality Estimator for speech and audio
Apache License 2.0

speech mode: Patch 0 is always the beginning of the file, even when there is no voice activity #74

Open mchinen opened 1 year ago

mchinen commented 1 year ago

Example output from files with -50 dB silence in the first second:


Patch Idx  Similarity  Ref Patch: Start - End  Deg Patch: Start - End
0          1.000000    0.186 - 0.580           0.580 - 0.974
1          0.384118    1.380 - 1.780           1.420 - 1.820
2          0.522233    2.180 - 2.580           2.180 - 2.580
3          0.742840    6.180 - 6.580           6.180 - 6.580
4          0.688050    10.180 - 10.576         10.184 - 10.580
5          0.905596    10.580 - 10.980         10.580 - 10.980
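
To make the problem easier to spot, here is a minimal Python sketch that flags patches whose reference span lies entirely inside the silent region. The rows are copied from the output above. The 1.0 s silence boundary comes from the description of the test files, and the helper name `patches_in_silence` is made up for illustration; it is not part of ViSQOL.

```python
# Rows copied from the patch output above:
# (patch_idx, similarity, ref_start_s, ref_end_s, deg_start_s, deg_end_s)
PATCHES = [
    (0, 1.000000, 0.186, 0.580, 0.580, 0.974),
    (1, 0.384118, 1.380, 1.780, 1.420, 1.820),
    (2, 0.522233, 2.180, 2.580, 2.180, 2.580),
    (3, 0.742840, 6.180, 6.580, 6.180, 6.580),
    (4, 0.688050, 10.180, 10.576, 10.184, 10.580),
    (5, 0.905596, 10.580, 10.980, 10.580, 10.980),
]

# Assumption: the first second of the test files is silence (about -50 dB).
SILENCE_END_S = 1.0


def patches_in_silence(patches, silence_end_s=SILENCE_END_S):
    """Return indices of patches whose reference span ends within the silence."""
    return [
        idx
        for idx, _sim, _ref_start, ref_end, _deg_start, _deg_end in patches
        if ref_end <= silence_end_s
    ]


print(patches_in_silence(PATCHES))  # prints [0]
```

Only patch 0 is flagged: its reference span (0.186 - 0.580) sits inside the silent first second, while every other patch starts after 1.0 s.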