anicolson / DeepXi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Mozilla Public License 2.0
497 stars · 127 forks

Can you share the ResLSTM codes? #27

Closed: 20050710212 closed this issue 4 years ago

20050710212 commented 4 years ago

Thank you for such fantastic work.

I am also interested in the ResLSTM and want to train it myself. Could you share the code? I'd appreciate any help. By the way, what is RDLNet? Does it bring a large improvement in WER? When will it be released?

Thank you!

anicolson commented 4 years ago

ResLSTM has been added. I have not tested it with the current code, so please let me know of any issues.
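For anyone following along, here is a minimal TF2/Keras sketch of the idea behind a residual LSTM block: an LSTM layer wrapped with an identity skip connection. The block count, unit count, and 257-bin input below are illustrative assumptions, not the repository's settings; the code in the repository is the reference.

```python
import tensorflow as tf
from tensorflow.keras import layers

def res_lstm_block(x, units):
    """One residual LSTM block: an LSTM layer whose output is
    added to its input via an identity skip connection."""
    y = layers.LSTM(units, return_sequences=True)(x)
    if x.shape[-1] != units:
        # Project the input so the skip-connection dimensions match.
        x = layers.Dense(units)(x)
    return layers.Add()([x, y])

# Illustrative input: sequences of 257-bin spectral feature frames.
inputs = tf.keras.Input(shape=(None, 257))
x = inputs
for _ in range(5):  # the number of blocks here is an assumption
    x = res_lstm_block(x, units=512)
outputs = layers.Dense(257)(x)  # frame-wise outputs, e.g. mapped a priori SNR
model = tf.keras.Model(inputs, outputs)
```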

RDLNet can be found here: https://arxiv.org/abs/2002.12794. Hopefully it will be made available within the next couple of weeks. I will also post some speech enhancement results for it, to compare it to ResNet.

20050710212 commented 4 years ago

Thanks for the code and the paper; I will read them soon. I am looking forward to your new work.

20050710212 commented 4 years ago

@anicolson Hi Aaron, I find that training ResLSTM is very slow compared to ResNet. For example, ResLSTM takes about 20 s/iter, while ResNet runs at about 4 iter/s.

anicolson commented 4 years ago

This is why I switched to ResNet. Training an LSTM architecture is very time-consuming, as the recurrence must be computed sequentially over the time steps, whereas convolutions can be computed in parallel across the whole sequence.

20050710212 commented 4 years ago

Thank you. I am using an RTX 2080. Is it reasonable that ResNet trains about 80 times faster than ResLSTM (20 s/iter versus 0.25 s/iter)?

anicolson commented 4 years ago

Yes

20050710212 commented 4 years ago

Thanks a lot.

20050710212 commented 4 years ago

@anicolson Hi Aaron, thanks for sharing the RDLNet paper. In the experiment section, the configuration of the ResNet differs from the one in Deep Xi. It says:

Each residual block contained 2 causal dilated convolutional units with an output size of 64, and a kernel size of 3. For each block, d was cycled from 1 to 8 (increasing by a power of 2). ResNets of sizes 0.53, 1.03, 1.53, and 2.03 million parameters were formed by cascading 20, 40, 60, and 80 residual blocks, respectively.

  1. Is it more efficient to use the structure described in the paper?
  2. Is the structure I drew correct (a code sketch of my reading follows)? Thank you so much. [image: diagram of the proposed ResNet structure]
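Here is a minimal Keras sketch of how I read that configuration. The ReLU activations, the placement of the skip connection around each pair of units, and the assumption that the input has already been projected to 64 channels are my guesses, not details from the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, d, filters=64, kernel_size=3):
    """Two causal dilated convolutional units with dilation rate d,
    plus a skip connection around the pair (my reading of the paper)."""
    y = x
    for _ in range(2):
        y = layers.Conv1D(filters, kernel_size, dilation_rate=d,
                          padding='causal', activation='relu')(y)
    return layers.Add()([x, y])

# Assumes the input has already been projected to 64 channels.
inputs = tf.keras.Input(shape=(None, 64))
x = inputs
n_blocks = 20  # 20/40/60/80 blocks -> 0.53/1.03/1.53/2.03 M parameters
for b in range(n_blocks):
    d = 2 ** (b % 4)  # dilation rate cycled 1, 2, 4, 8, 1, 2, ...
    x = residual_block(x, d)
model = tf.keras.Model(inputs, x)
```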
anicolson commented 4 years ago

Please see this for the ResNet: https://ieeexplore.ieee.org/document/9066933

It utilises bottleneck residual blocks, which are more parameter-efficient than normal residual blocks (see the RDLNet paper). It is the same one available in the Deep Xi repository.
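As a rough illustration of the bottleneck idea (in the style of He et al.; the widths below are assumptions for illustration, not the repository's exact settings): a size-1 convolution narrows the channels, the causal dilated convolution operates in the narrow space, and a second size-1 convolution restores the width.

```python
import tensorflow as tf
from tensorflow.keras import layers

def bottleneck_block(x, d_model=256, d_f=64, kernel_size=3, d=1):
    """Bottleneck residual block: reduce -> dilated conv -> expand,
    with a skip connection around all three."""
    y = layers.Conv1D(d_f, 1, activation='relu')(x)            # size-1 reduce: d_model -> d_f
    y = layers.Conv1D(d_f, kernel_size, dilation_rate=d,
                      padding='causal', activation='relu')(y)  # dilated conv in the narrow space
    y = layers.Conv1D(d_model, 1)(y)                           # size-1 expand: d_f -> d_model
    return layers.Add()([x, y])

# Usage sketch:
inputs = tf.keras.Input(shape=(None, 256))
x = bottleneck_block(inputs, d=4)
```

The width-k convolution then costs roughly d_f * d_f * k weights instead of d_model * d_model * k, which is where the parameter saving over a normal block comes from.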

20050710212 reopened this 4 years ago

20050710212 commented 4 years ago

Thank you so much.