leto19 / MultiMetricGANplusplus


issues with dnsmos score #1

Open Janeliao-C opened 4 months ago

Janeliao-C commented 4 months ago

Hello!

I am currently testing DNSMOS on my CHiME eval/1 subset, but I am unable to reproduce the results reported in your paper. I am using the scripts as provided, so I suspect there might be an issue with the data I am using.

Could you kindly provide a few numbered samples from your test set so that I can compare and troubleshoot the data issues?

Thank you very much for your assistance!

Best regards,

Jane Liao

leto19 commented 3 months ago

Hi Jane,

Thank you for your interest in our work.

Results for the CHiME-7 UDASE challenge can be found here:
https://www.chimechallenge.org/challenges/chime7/task2/results#audio-examples

This page hosts numbered audio examples from all the challenge entries, including the precursor system to MultiMetricGAN+/+, which is denoted as CMGAN-FT in the challenge results. This should help you troubleshoot your data problems. If this is not sufficient, I should be able to provide test audio directly from MultiMetricGAN+/+ (I'm away from my normal workstation at the moment), so please let me know.

My other thought is that your problem might be sample-rate related. I think DNSMOS only supports 16 kHz, and perhaps the CHiME eval set is provided at 48 kHz?

Again, thank you for your interest, George
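For anyone wanting to rule out the sample-rate theory quickly, here is a minimal sketch that lists WAV files whose rate differs from 16 kHz. It assumes the files are plain PCM WAVs (Python's standard `wave` module cannot read other encodings), and the function name and folder argument are hypothetical, not part of the repository's scripts.

```python
import wave
from pathlib import Path


def find_unexpected_sample_rates(folder, expected_hz=16000):
    """Return {filename: sample_rate} for WAV files not at expected_hz.

    Hypothetical helper for illustration; only the top level of
    `folder` is scanned, and only PCM WAV files are supported.
    """
    mismatches = {}
    for path in sorted(Path(folder).glob("*.wav")):
        # Read the sample rate from the WAV header.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_hz:
            mismatches[path.name] = rate
    return mismatches
```

An empty dictionary means every file in the folder is already at the expected rate, in which case the discrepancy probably lies elsewhere.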

Janeliao-C commented 3 months ago

Hello George,

I processed the originally downloaded dataset according to the official JSON file, using the script provided by the official sources, and I have verified that the duration of the sliced data matches the parameters in the JSON file. However, using the same officially provided scripts, my results show BAK as the highest of the three metrics at 3.57, with both SIG and OVR scoring below 3.0, whereas the official metrics indicate a SIG_MOS of 3.48.

The data on the CHiME page is labeled as Sample x and does not include identifiers like S21_P47_99.wav, making it difficult for me to check the differences between the official data and mine. Could you please provide me with such data? I would greatly appreciate it!

Additionally, I have checked my CHiME data, and the files are indeed all 16 kHz.

Thank you so much!

P.S. The dog in your profile picture is very cute.

Best,

Jane

leto19 commented 3 months ago

Hi Jane,

I think you should be able to get the file name identifier from that page, for example:

https://www.chimechallenge.org/challenges/chime7/task2/audio/input/S01_P02_8_output.wav
https://www.chimechallenge.org/challenges/chime7/task2/audio/NB/S01_P02_8_output.wav
https://www.chimechallenge.org/challenges/chime7/task2/audio/CMGAN-FT/S01_P02_8_output.wav

I got these by right-clicking on the audio player object and clicking 'Open Audio in New Tab' (Google Chrome on OSX). The filename also reveals itself upon download.
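If it helps with matching these against local files, here is a small sketch that pulls the system name and the utterance identifier out of such a URL. The function name is hypothetical, and stripping the `_output` suffix is an assumption that local files are named like `S21_P47_99.wav`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_example_url(url, suffix="_output"):
    """Return (system, identifier) from a challenge audio URL.

    Hypothetical helper: the system is taken to be the last folder
    in the URL path, and `suffix` is removed from the file stem on
    the assumption that local files do not carry it.
    """
    path = PurePosixPath(urlparse(url).path)
    system = path.parent.name
    stem = path.stem
    if suffix and stem.endswith(suffix):
        stem = stem[: -len(suffix)]
    return system, stem


print(parse_example_url(
    "https://www.chimechallenge.org/challenges/chime7/task2/audio/CMGAN-FT/S01_P02_8_output.wav"
))
```

For the URL above this gives the pair `CMGAN-FT` and `S01_P02_8`, which can then be looked up in a local evaluation folder.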

I've also uploaded some audio from MultiMetricGAN+/+ to this Google Drive folder:
https://drive.google.com/drive/folders/1t8lJS6rMC8tQ8jFVuMPZu-N46hmkoTz5?usp=sharing

These are from the SIG/BAK/OVR model in the paper.

George