Open mnabihali opened 4 years ago
Hello, have you solved your problem? I have the same problem as you.
No, but I removed this assertion and calculated the required scores manually (a signal-by-signal comparison does not raise an error). If you find another solution, please tell me.
-- Mohamed Nabih Ali, Assistant Lecturer, Faculty of Computers and IT, Egyptian E-Learning University, Ain Shams Center
Mail: mohmed.nabih@gmail.com Mobile: +201285659213 Work: 02-33318417
Are you Chinese? We can chat on QQ. My English is not good.
I found that this code pads the mixture, but it does not pad the clean and enhanced signals. Which dataset are you using, and how are your results?
Sorry, I am not Chinese. As for the dataset, I am using the VCTK dataset and a noisy version of the LibriSpeech dataset.
May I ask which part of the code does the padding for the mixture?
Thanks
Did you find another solution to the problem?
I am also using the VCTK database, and I haven't found a solution yet. If I find one, I will tell you.
I padded the clean, enhanced, and mixture signals, and that problem is solved, but now I have a new one. Computing STOI raises an error: AttributeError: module 'numpy' has no attribute 'gcd'. I haven't found a solution, so I only computed PESQ:
#stoi_c_n.append(compute_STOI(clean, mixture, sr=16000))
#stoi_c_e.append(compute_STOI(clean, enhanced, sr=16000))
pesq_c_n.append(compute_PESQ(clean, mixture, sr=16000))
pesq_c_e.append(compute_PESQ(clean, enhanced, sr=16000))
Did you have the same problem?
Can you share the code you used to pad the signals, so that I can try computing STOI?
@Lerry123 Can you tell me how you pad the signals? Then I can try to solve the NumPy issue.
Thanks
trainer.py:

for i, (mixture, clean, name) in enumerate(self.validation_data_loader):
    assert len(name) == 1, "Only support batch size is 1 in enhancement stage."
    name = name[0]
    padded_length = 0

    mixture = mixture.to(self.device)  # [1, 1, T]
    clean = clean.to(self.device)

    # The input of the model should be fixed length.
    if mixture.size(-1) % sample_length != 0:
        padded_length = sample_length - (mixture.size(-1) % sample_length)
        # Pad both the mixture and the clean signal with zeros.
        mixture = torch.cat([mixture, torch.zeros(1, 1, padded_length, device=self.device)], dim=-1)
        clean = torch.cat([clean, torch.zeros(1, 1, padded_length, device=self.device)], dim=-1)

    assert mixture.size(-1) % sample_length == 0 and mixture.dim() == 3
    mixture_chunks = list(torch.split(mixture, sample_length, dim=-1))

    enhanced_chunks = []
    for chunk in mixture_chunks:
        enhanced_chunks.append(self.model(chunk).detach().cpu())

    enhanced = torch.cat(enhanced_chunks, dim=-1)  # [1, 1, T]
    enhanced = enhanced.to(self.device)

    if padded_length != 0:
        # Appending padded_length zeros and then slicing them off again
        # leaves enhanced at the padded length, matching mixture and clean.
        enhanced = torch.cat([enhanced, torch.zeros(1, 1, padded_length, device=self.device)], dim=-1)
        enhanced = enhanced[:, :, :-padded_length]

    enhanced = enhanced.cpu().reshape(-1).numpy()
    clean = clean.cpu().numpy().reshape(-1)
    mixture = mixture.cpu().numpy().reshape(-1)
Thanks
Regarding the AttributeError: module 'numpy' has no attribute 'gcd':
numpy.gcd was added in NumPy 1.15.0, so check your NumPy version and upgrade if it is older. Get back to me if that does not solve it.
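A minimal sketch of the version check suggested here. The helper names `version_tuple` and `numpy_has_gcd` are made up for illustration, and the NumPy import is guarded so the snippet also runs where NumPy is missing.

```python
import math


def version_tuple(version):
    """Convert a version string such as '1.14.5' or '1.19.0rc1' to a 3-tuple of ints."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def numpy_has_gcd(version):
    """numpy.gcd exists from NumPy 1.15.0 onwards."""
    return version_tuple(version) >= (1, 15, 0)


try:
    import numpy as np
    installed_version = np.__version__
except ImportError:  # NumPy is not installed in this environment
    installed_version = None

if installed_version is not None:
    print(f"NumPy {installed_version}: gcd available = {numpy_has_gcd(installed_version)}")

# math.gcd from the standard library works on plain Python ints regardless of NumPy.
print(math.gcd(48000, 16000))  # prints 16000
```

If the check reports that gcd is unavailable, upgrading NumPy in the active environment is the fix described in this thread.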
Thank you! It was solved by your suggestion.
Thanks. I hope everything goes well. Can you give me your email so that I can contact you about further problems?
Can I get your email? Have you run the test? My results are very confusing.
I think the problem is here: https://github.com/haoxiangsnr/Wave-U-Net-for-Speech-Enhancement/blob/c8c9d8945959ba8c3aa1e7cb18cddc10dbc52210/trainer/trainer.py#L78
should be:
```
if padded_length != 0:
    enhanced = enhanced[:,:,:-padded_length]
    mixture = mixture[:,:,:-padded_length]
```
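To see why trimming both signals makes the three lengths agree, here is a small self-contained sketch of the pad, chunk, and trim arithmetic. It uses plain Python lists in place of tensors, and `pad_and_trim` and `identity_model` are hypothetical stand-ins, not code from the repository.

```python
def pad_and_trim(mixture, sample_length, model):
    """Pad mixture to a multiple of sample_length, enhance it chunk by chunk,
    then trim both signals back to the original length."""
    padded_length = 0
    if len(mixture) % sample_length != 0:
        padded_length = sample_length - (len(mixture) % sample_length)
        mixture = mixture + [0.0] * padded_length

    enhanced = []
    for start in range(0, len(mixture), sample_length):
        enhanced.extend(model(mixture[start:start + sample_length]))

    # The fix from the thread: drop the padding from BOTH signals.
    if padded_length != 0:
        enhanced = enhanced[:-padded_length]
        mixture = mixture[:-padded_length]
    return mixture, enhanced


def identity_model(chunk):
    # Hypothetical stand-in for the network: returns the chunk unchanged.
    return list(chunk)


clean = [0.1] * 10
mixture, enhanced = pad_and_trim([0.2] * 10, sample_length=4, model=identity_model)
assert len(mixture) == len(clean) == len(enhanced)
```

With 10 samples and a chunk size of 4, two zeros are appended before enhancement and removed afterwards, so the clean signal never needs padding.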
That's the problem, as you said. When I use the VCTK database, the test results are bad and the speech is distorted.
@Lerry123 did you try on the same dataset? And what are the advantages of training on VCTK?
I am training on the same dataset, 500 epochs so far, and the quality is not great. PESQ is quite low at 1.75 and STOI is 0.85. Yes, the sound is distorted, but I will tweak some parameters and see if that gives a boost.
I set sr=16000 in waveform_dataset.py and waveform_dataset_enhancement.py, and the PESQ is now 2.63. This is my best result.
waveform_dataset.py:
line 65: mixture, _ = librosa.load(os.path.abspath(os.path.expanduser(mixture_path)), sr=16000)
line 66: clean, _ = librosa.load(os.path.abspath(os.path.expanduser(clean_path)), sr=16000)
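The sample-rate change above matters because PESQ and STOI are called with sr=16000 throughout this thread, so audio loaded at any other rate would be scored as if it were 16 kHz. Below is a minimal sketch, using only Python's standard `wave` module, of checking a file's native rate before deciding whether to resample on load. The helper names, the demo file, and the 48000 Hz value are illustrative assumptions, not taken from the repository.

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Return the sample rate stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def write_silence(path, sample_rate, num_samples=100):
    """Write a short mono 16-bit WAV of silence, for demonstration only."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_samples)


METRIC_SR = 16000  # the sr value passed to compute_PESQ / compute_STOI in the thread

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.wav")
    write_silence(demo_path, sample_rate=48000)  # assumed rate for the demo file
    file_sr = wav_sample_rate(demo_path)
    needs_resampling = file_sr != METRIC_SR
    print(f"file sr={file_sr}, metric sr={METRIC_SR}, resample needed: {needs_resampling}")
```

When the rates differ, passing sr=16000 to librosa.load, as in the two lines above, resamples the audio while loading.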
@Lerry123 got it, thanks! PESQ = 2.63, is that with VCTK?
Yes
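How did you change the parameters? I am using the same dataset as SEGAN, as in the original paper, and the trained model produces severely distorted audio. What is going on? Could you help me figure it out? Thanks @Lerry123 @diff7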
According to one answer, this could be the solution:
I think problem is here: https://github.com/haoxiangsnr/Wave-U-Net-for-Speech-Enhancement/blob/c8c9d8945959ba8c3aa1e7cb18cddc10dbc52210/trainer/trainer.py#L77
should be:
if padded_length != 0:
    enhanced = enhanced[:,:,:-padded_length]
    mixture = mixture[:,:,:-padded_length]
Sorry for the previous email. You could experiment with the parameters and check the results. Thanks
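Thank you very much for your reply. I am using the dataset from https://datashare.ed.ac.uk/handle/10283/1942, and the training results are not good: PESQ=1.35, STOI=0.65. Which dataset are you using? @mnabihali
mnabihali, thank you very much! I changed the sampling rate in waveform_dataset.py to 16 kHz, and the results improved.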
(In reply to the earlier comment about padding the clean, enhanced, and mixture signals and the STOI error, AttributeError: module 'numpy' has no attribute 'gcd'.)
Hello, may I ask how you solved the error raised at the tenth epoch? I have been working on it for a long time without success.
thank you so much
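(In reply to the earlier invitation to chat on QQ.)
Hello, I am a second-year graduate student and want to reproduce this code as the basis for a new contribution. I have run into some problems while reproducing it. Would you be able to take a look? Could we connect on QQ to discuss?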
When I try to run the code, it raises an error at this assertion: assert len(mixture) == len(clean) == len(enhanced). I printed the length of each and found that enhanced and clean have equal lengths, but the mixture is longer than both.
I hope you can help me as soon as possible.
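A bare assertion does not say which signal is off or by how much. The sketch below is a hypothetical diagnostic (the function name and the example lengths are assumptions for illustration) that checks whether the extra mixture samples match the zero-padding needed to reach a multiple of the model's fixed input length.

```python
def describe_length_mismatch(mixture_len, clean_len, enhanced_len, sample_length):
    """Explain a length mismatch in terms of zero-padding to a multiple of sample_length."""
    if mixture_len == clean_len == enhanced_len:
        return "lengths match"

    # Padding that would be needed to bring the clean length up to a full chunk.
    remainder = clean_len % sample_length
    expected_pad = 0 if remainder == 0 else sample_length - remainder

    extra = mixture_len - clean_len
    if clean_len == enhanced_len and extra == expected_pad:
        return f"mixture still carries {extra} padded samples"
    return (
        f"unexplained mismatch: mixture={mixture_len}, "
        f"clean={clean_len}, enhanced={enhanced_len}"
    )


# Example with made-up lengths: 10 real samples, chunks of 4, so 2 padded samples.
print(describe_length_mismatch(12, 10, 10, sample_length=4))
```

If the message reports leftover padding, the mixture was padded for the model but never trimmed back, which is the situation described in this report.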