Open XCYu-0903 opened 1 year ago
Has anyone managed to reproduce the results?
I'm stuck on setting up the environment. Did you get NCCL errors?
I ran into a similar problem. I resampled VCTK-DEMAND/test (48000 Hz) to 16000 Hz, and the results are:
pesq: 1.2306799195634508 csig: 1.6080942775665945 cbak: 2.1193723316105366 covl: 1.4202725754636616 ssnr: 0.6998261689532145 stoi: 0.6101097034995405
I used the default loss weights:
parser.add_argument("--loss_weights", type=list, default=[0.1, 0.9, 0.2, 0.05],
and this checkpoint:
CMGAN_epoch_50_0.092
I want to set the loss weights to [0.3, 0.7, 0.2, 0.05]. In the terminal I ran:
python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight --loss_weights 0.3,0.7,0.2,0.05
and the error is:
Traceback (most recent call last):
  File "/home//CMGAN-main/src/train.py", line 355, in <module>
    main(args)
  File "/home/CMGAN-main/src/train.py", line 342, in main
    trainer.train()
  File "/home/CMGAN-main/src/train.py", line 294, in train
    loss, disc_loss = self.train_step(batch)
  File "/home/CMGAN-main/src/train.py", line 224, in train_step
    loss = self.calculate_generator_loss(generator_outputs)
  File "/home/CMGAN-main/src/train.py", line 178, in calculate_generator_loss
    args.loss_weights[0] * loss_ri
TypeError: only integer tensors of a single element can be converted to an index
I then tried passing the default loss weights explicitly:
python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight --loss_weights 0.1,0.9,0.2,0.05
to match the default in the script:
parser.add_argument("--loss_weights", type=list, default=[0.1, 0.9, 0.2, 0.05], help="weights of RI components, magnitude, time loss, and Metric Disc")
but the error persists, and I don't understand why.
If I leave out the --loss_weights argument, training runs fine:
python3 train.py --data_dir /home/CMGAN-main/VCTK-DEMAND --save_model_dir ./saved_model_0.3_0.7_0.2_0.05loss_weight
Does this mean I can't specify the loss weights from the command line? How do you pass --loss_weights? Thanks.
Change it within the script. Passing lists as arguments through the command line using argparse isn't recommended.
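For anyone hitting the same TypeError: it is consistent with how argparse handles type=list. argparse calls list() on the raw string, so "0.3,0.7,0.2,0.05" becomes a list of single characters, and args.loss_weights[0] * loss_ri then multiplies the string '0' by a tensor. The sketch below reproduces the parsing behaviour without PyTorch and shows two possible alternatives. The helper parse_weights and the nargs=4 variant are illustrative assumptions, not code from the CMGAN repository.

```python
import argparse


def parse_weights(text: str) -> list[float]:
    """Hypothetical helper: turn "0.3,0.7,0.2,0.05" into a list of floats."""
    return [float(part) for part in text.split(",")]


# 1) The definition quoted in this thread: type=list splits the string into characters.
broken = argparse.ArgumentParser()
broken.add_argument("--loss_weights", type=list, default=[0.1, 0.9, 0.2, 0.05])
broken_args = broken.parse_args(["--loss_weights", "0.3,0.7,0.2,0.05"])
print(broken_args.loss_weights[:4])  # ['0', '.', '3', ',']

# 2) Alternative A: keep the comma-separated syntax, but parse it explicitly.
comma = argparse.ArgumentParser()
comma.add_argument("--loss_weights", type=parse_weights, default=[0.1, 0.9, 0.2, 0.05])
comma_args = comma.parse_args(["--loss_weights", "0.3,0.7,0.2,0.05"])
print(comma_args.loss_weights)  # [0.3, 0.7, 0.2, 0.05]

# 3) Alternative B: four space-separated floats via nargs.
spaced = argparse.ArgumentParser()
spaced.add_argument("--loss_weights", type=float, nargs=4, default=[0.1, 0.9, 0.2, 0.05])
spaced_args = spaced.parse_args(["--loss_weights", "0.3", "0.7", "0.2", "0.05"])
print(spaced_args.loss_weights)  # [0.3, 0.7, 0.2, 0.05]
```

With alternative B the command line would be --loss_weights 0.3 0.7 0.2 0.05 (spaces, no commas). Editing the default inside the script, as suggested above, avoids the issue entirely.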
Thanks, that works.
Do you happen to know how the authors resampled the wav files to 16000 Hz? The .wav files in VCTK-DEMAND/test are 48000 Hz. I tried resampling them and running evaluation.py, but the results are very bad and I can't figure out why. The quality of the generated .wav files is poor, even though I used the checkpoint from the original project and the original dataset from the paper. So I think my resampling method is wrong.
evaluation.py

def evaluation(model_path, noisy_dir, clean_dir, save_tracks, saved_dir):
    clean_audio, sr = sf.read(clean_path)
    print("clean_audio", clean_audio)  # a one-dimensional NumPy array
    print("sr", sr)  # sr 48000
    clean_audio = librosa.resample(clean_audio, sr, 16000)
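One possible way to do the 48000 Hz to 16000 Hz conversion is polyphase resampling, sketched below with scipy.signal.resample_poly. This is an assumption about a reasonable approach, not the authors' confirmed preprocessing, and resample_to_16k is a hypothetical helper name. Whatever method is used, the noisy and clean files both need to go through the same conversion before evaluation. Also note that recent librosa releases expect keyword arguments, i.e. librosa.resample(clean_audio, orig_sr=sr, target_sr=16000), so the positional call in the snippet above may fail depending on the installed version.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_to_16k(audio: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    """Hypothetical helper: polyphase resampling with built-in anti-aliasing."""
    if orig_sr == target_sr:
        return audio
    common = gcd(orig_sr, target_sr)
    # For 48000 -> 16000 this reduces to up=1, down=3.
    return resample_poly(audio, target_sr // common, orig_sr // common)


# Synthetic check: one second of a 440 Hz tone sampled at 48 kHz.
sr = 48000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
tone_16k = resample_to_16k(tone, sr)
print(len(tone_16k))  # 16000
```

In an evaluation loop, the same function would be applied to each array returned by sf.read for both the noisy and the clean directory, or the whole dataset could be converted once and written back to disk at 16 kHz.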
Hello! Your paper and code are very enlightening to me. I tried to train the model from scratch on the VCTK-DEMAND dataset to reproduce the results, but the results are very bad: PESQ and SSNR are only 2.13 and 1.12, respectively. I didn't modify the code except for changing cut_len to 1.6 and batch_size to 2 to fit my limited GPU, and I don't know where the error is. For the hyper-parameters, I ran experiments with both [0.3, 0.7, 1, 0.01] from the paper and [0.1, 0.9, 0.2, 0.05] from GitHub, but the results are similar. For inference, I changed the variable length to 8. Looking forward to your reply.