google / visqol

Perceptual Quality Estimator for speech and audio
Apache License 2.0

MOS-LQO results are low in speech mode #92

Open hoantv93 opened 1 year ago

hoantv93 commented 1 year ago

We tried to apply ViSQOL to evaluate the audio quality of a security camera device. Here is our recording process:

Human voice -> recorded by a high-quality microphone (48kHz, 16bit, mono) -> resampled (16kHz, 16bit, mono) -> reference audio (REF.MONO.16KHZ.VOICE.01.wav)
Human voice -> recorded by the camera's microphone -> resampled (16kHz, 16bit, mono) -> degraded audio (DEG.MONO.16KHZ.VOICE.01.wav)

ViSQOL command:
visqol --reference_file REF.MONO.16KHZ.VOICE.01.wav --degraded_file DEG.MONO.16KHZ.VOICE.01.wav --verbose --use_speech_mode

The returned MOS is 1.64007 (lower than we expected). However, the MOS is 3.41819 when the same files are run in audio mode.
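As a sanity check on the pipeline described above, one could confirm that both files really have the stated format (16kHz, 16bit, mono) before comparing scores. This is a minimal sketch using only Python's standard `wave` module; the helper names `wav_format` and `format_mismatches` are hypothetical and not part of ViSQOL, and the check says nothing about why the speech-mode score is low.

```python
import sys
import wave

# Format stated in the post for both files: 16 kHz, 16-bit, mono.
EXPECTED = {"sample_rate_hz": 16000, "bits_per_sample": 16, "channels": 1}


def wav_format(path):
    """Read the header of a PCM WAV file and return its basic format.

    The standard-library wave module only reads PCM WAV files and raises
    wave.Error for other encodings.
    """
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def format_mismatches(path, expected=EXPECTED):
    """Return {field: (actual, expected)} for every field that differs."""
    actual = wav_format(path)
    return {
        field: (actual[field], want)
        for field, want in expected.items()
        if actual[field] != want
    }


if __name__ == "__main__":
    # Paths are taken from the command line; nothing runs without arguments.
    for path in sys.argv[1:]:
        print(path, wav_format(path), format_mismatches(path) or "format OK")
```

Saved under a hypothetical name such as `check_wav.py`, it could be run as `python check_wav.py REF.MONO.16KHZ.VOICE.01.wav DEG.MONO.16KHZ.VOICE.01.wav`. An empty mismatch result for both files would at least rule out a format problem introduced by the resampling step.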

Is our test method OK? What do we need to do to improve the MOS results in speech mode?

Audio files

Yutong-gannis commented 3 weeks ago

Did you already solve this problem?