ruizhecao96 / CMGAN

Conformer-based Metric GAN for speech enhancement
MIT License
309 stars 60 forks

Inferior results trained from scratch #35

Open XCYu-0903 opened 1 year ago

XCYu-0903 commented 1 year ago

Hello! Your paper and code have been very enlightening to me. I tried to train the model from scratch on the VCTK-DEMAND dataset to reproduce the results, but the results are very poor: PESQ and SSNR are only 2.13 and 1.12, respectively. I did not modify the code except for changing cut_len to 1.6 and batch_size to 2 to fit my limited GPU. I can't tell what is wrong in my setup. For the loss-weight hyperparameters, I ran experiments with both [0.3, 0.7, 1, 0.01] from the paper and [0.1, 0.9, 0.2, 0.05] from GitHub, but the results are similar. For inference, I changed the variable length to 8. Looking forward to your reply.

Siaiai commented 11 months ago

Has anyone managed to reproduce the results?

moshengmao commented 11 months ago

I'm stuck on setting up the environment. Did you get NCCL errors?

moshengmao commented 11 months ago

I ran into a similar problem. I resampled VCTK-DEMAND/test (48000 Hz) to 16000 Hz, and the results are:

pesq: 1.2306799195634508
csig: 1.6080942775665945
cbak: 2.1193723316105366
covl: 1.4202725754636616
ssnr: 0.6998261689532145
stoi: 0.6101097034995405

I used the default loss weights:

parser.add_argument("--loss_weights", type=list, default=[0.1, 0.9, 0.2, 0.05],

and the checkpoint

CMGAN_epoch_50_0.092

moshengmao commented 11 months ago

I want to train with the loss weights [0.3, 0.7, 0.2, 0.05]. In the terminal I ran:

python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight --loss_weights 0.3,0.7,0.2,0.05

and the error is:

Traceback (most recent call last):
  File "/home//CMGAN-main/src/train.py", line 355, in <module>
    main(args)
  File "/home/CMGAN-main/src/train.py", line 342, in main
    trainer.train()
  File "/home/CMGAN-main/src/train.py", line 294, in train
    loss, disc_loss = self.train_step(batch)
  File "/home/CMGAN-main/src/train.py", line 224, in train_step
    loss = self.calculate_generator_loss(generator_outputs)
  File "/home/CMGAN-main/src/train.py", line 178, in calculate_generator_loss
    args.loss_weights[0] * loss_ri
TypeError: only integer tensors of a single element can be converted to an index

I then tried passing the default loss weights explicitly:

python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight --loss_weights 0.1,0.9,0.2,0.05

according to

parser.add_argument("--loss_weights", type=list, default=[0.1, 0.9, 0.2, 0.05], help="weights of RI components, magnitude, time loss, and Metric Disc")

but the error is still there. I don't understand why.

If I leave out the --loss_weights argument, training runs fine, like this:

python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight

Does this mean I can't specify the loss weights on the command line? How do you pass --loss_weights? Thanks.

moshengmao commented 11 months ago

Change it within the script. Passing lists as arguments through the command line using argparse isn't recommended.
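For anyone who still wants to set the weights from the command line, here is a minimal sketch of what is probably going on, not the author's fix. With type=list, argparse calls list() on the raw string, so the "weights" become single characters; multiplying a character such as '0' by a tensor would explain the TypeError in the traceback above. The helpers parse_weights and build_parser are hypothetical names introduced only for this illustration.

```python
import argparse


def parse_weights(text):
    """Parse a comma-separated string such as "0.3,0.7,0.2,0.05" into four floats."""
    weights = [float(part) for part in text.split(",")]
    if len(weights) != 4:
        raise argparse.ArgumentTypeError("expected 4 comma-separated numbers")
    return weights


def build_parser(weight_type):
    """Build a parser whose --loss_weights option uses the given type callable."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--loss_weights", type=weight_type,
                        default=[0.1, 0.9, 0.2, 0.05])
    return parser


cli = ["--loss_weights", "0.3,0.7,0.2,0.05"]

# type=list: argparse calls list("0.3,0.7,0.2,0.05"), which splits into characters.
broken = build_parser(list).parse_args(cli)
print(broken.loss_weights[:3])  # ['0', '.', '3']

# A custom type function converts the same string into real floats.
fixed = build_parser(parse_weights).parse_args(cli)
print(fixed.loss_weights)  # [0.3, 0.7, 0.2, 0.05]
```

Without the flag, the default list is used unchanged, which matches the observation that training works when --loss_weights is left out.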

Thanks, that works now.

moshengmao commented 11 months ago

Do you know how the author resampled the wav files to 16000 Hz? The .wav files in VCTK-DEMAND/test are 48000 Hz. I tried resampling them and running evaluation.py, and the results are very bad. I can't figure out why. The quality of the generated .wav files is poor, even though I used the .ckpt from the original project and the original dataset from the paper. So I think my resampling is wrong.

evaluation.py
def evaluation(model_path, noisy_dir, clean_dir, save_tracks, saved_dir):
    clean_audio, sr = sf.read(clean_path)
    print("clean_audio", clean_audio)  # a one-dimensional NumPy array
    print("sr", sr)  # sr 48000
    clean_audio = librosa.resample(clean_audio, sr, 16000)
