Le-Xiaohuai-speech / DPCRN_DNS3

Implementation of paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"

Real-time inference without lstm state #17

Open plutols opened 2 years ago

plutols commented 2 years ago

When you run inference in real time, the current and next initial_state of the LSTM default to None. So is each frame processed independently during inference?

Le-Xiaohuai-speech commented 2 years ago

We pass the state of the last step to the current step.

plutols commented 2 years ago

for i in range(L):
    output_data = sess.run(session_list[1], feed_dict={model.input: input_spec_data[:, i*blockshift:i*blockshift+blockLen]})[0]
    RT_output.append(output_data)

It seems the feed_dict contains only the frame data and not the LSTM state.

Le-Xiaohuai-speech commented 2 years ago

In fact, there is an update operator in session_list[1]. The "upop" holds the states and passes them to the next step.
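The pattern can be illustrated without TensorFlow. The sketch below is a toy and not the repository's code: `ToyStatefulStep`, `run_offline`, and the leaky-sum recurrence are hypothetical stand-ins for the LSTM and its update operator. It only shows that when each call stores its state for the next call, feeding one frame at a time gives the same result as processing the whole sequence.

```python
class ToyStatefulStep:
    """Toy stand-in for a graph whose recurrent state lives in a variable."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.state = 0.0  # plays the role of the stored LSTM state

    def run(self, frame):
        # The output depends on the fed frame and on the stored state.
        output = self.decay * self.state + frame
        # "Update op": keep the new state for the next call.
        self.state = output
        return output


def run_offline(frames, decay=0.5):
    """Process the whole sequence in one loop, for comparison."""
    state = 0.0
    outputs = []
    for frame in frames:
        state = decay * state + frame
        outputs.append(state)
    return outputs


frames = [1.0, 1.0, 1.0]
step = ToyStatefulStep()
streamed = [step.run(f) for f in frames]  # only the frame is passed in

print(streamed)  # [1.0, 1.5, 1.75]
```

The caller never passes the state explicitly, which mirrors a feed_dict that contains only the frame.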

plutols commented 2 years ago

Oh, I see. But why does only the inter_rnn return its state, while the intra_rnn does not?

Le-Xiaohuai-speech commented 2 years ago

The intra_rnns work independently at each time step.
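A minimal sketch of why only the inter RNN needs a stored state, assuming the intra RNN scans the frequency bins within one frame and the inter RNN follows each bin across frames. The leaky-sum recurrence and the names `intra_pass` and `inter_step` are hypothetical stand-ins, not the repository's layers.

```python
def intra_pass(frame, decay=0.5):
    """Scan the frequency bins of ONE frame; the state starts at zero every time."""
    state = 0.0
    out = []
    for value in frame:
        state = decay * state + value
        out.append(state)
    return out


def inter_step(frame, states, decay=0.5):
    """Advance one time step per frequency bin; states come from the previous frame."""
    new_states = [decay * s + v for s, v in zip(states, frame)]
    return list(new_states), new_states  # (output, states to carry forward)


frames = [[1.0, 2.0], [3.0, 4.0]]  # 2 frames x 2 frequency bins
inter_states = [0.0, 0.0]          # the only state kept between frames
results = []
for frame in frames:
    hidden = intra_pass(frame)  # nothing is kept across frames here
    out, inter_states = inter_step(hidden, inter_states)
    results.append(out)

print(results)  # [[1.0, 2.5], [3.5, 6.75]]
```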

plutols commented 2 years ago

session_list[1][1] cannot be written to a pb file, and I want to run inference from the pb. What should I do?

Le-Xiaohuai-speech commented 2 years ago

The assign operator cannot be written to a pb file. You can build a model of a single time step instead.
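One way to read "a model of a single time step" is a function that takes the previous state as an extra input and returns the new state as an extra output, so no assign operator is needed inside the graph. The sketch below shows only that calling convention with a toy recurrence; `single_step`, `stream`, and their arguments are hypothetical, and a real export would presumably expose the LSTM state tensors in the same way.

```python
def single_step(frame, state, decay=0.5):
    """Pure function: no hidden variable, the state goes in and comes out."""
    new_state = [decay * s + x for s, x in zip(state, frame)]
    output = list(new_state)
    return output, new_state


def stream(frames, num_bins):
    """The caller owns the state and feeds it back on every step."""
    state = [0.0] * num_bins
    outputs = []
    for frame in frames:
        output, state = single_step(frame, state)
        outputs.append(output)
    return outputs


print(stream([[1.0, 0.0], [1.0, 2.0]], num_bins=2))  # [[1.0, 0.0], [1.5, 2.0]]
```

With this layout the inference loop feeds both the frame and the previous state, and reads back both the output and the next state.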

plutols commented 2 years ago

I already have the model as a pb file. When I run inference from the pb, what should I do to get the same wav as your inference?

Le-Xiaohuai-speech commented 2 years ago

Could you show your code? I am not sure if the graph is recorded correctly.

plutols commented 2 years ago

for i in range(L):
    output_data = sess.run(session_list[1][0], feed_dict={
        model.input: input_spec_data[:, i * blockshift:i * blockshift + blockLen]})
    RT_output.append(output_data)

The graph is OK. Inference with my pb gives the same wav as your real_time_DPCRN.py, but I use session_list[1][0], not session_list[1].

plutols commented 2 years ago

I want to get the same wav as your real_time_DPCRN.py while using session_list[1], not session_list[1][0].

Le-Xiaohuai-speech commented 2 years ago

You mean the update operator (session_list[1][1]) does not work?

plutols commented 2 years ago

I solved this problem, but I have a new one: when I convert my model to TFLite, I get an error (bad padding, only SAME and VALID are supported). Have you run into this?

Le-Xiaohuai-speech commented 2 years ago

Use tf.pad instead of the padding parameter of Conv2D.
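The idea is to pad the tensor explicitly and then run the convolution with VALID padding. The pure-Python sketch below shows the arithmetic on a single axis, assuming a causal layout in which all padding goes before the first sample; `pad_causal` and `conv_valid` are hypothetical helpers. In TensorFlow the same two steps would be a `tf.pad` call followed by a `Conv2D` layer with `padding="valid"`.

```python
def pad_causal(seq, kernel_size):
    """Prepend kernel_size - 1 zeros so the output keeps the input length."""
    return [0.0] * (kernel_size - 1) + list(seq)


def conv_valid(seq, kernel):
    """Correlation with VALID padding: output length is len(seq) - len(kernel) + 1."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, seq[i:i + k]))
        for i in range(len(seq) - k + 1)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 1.0]  # sums the previous and the current sample

padded = pad_causal(signal, len(kernel))
print(conv_valid(padded, kernel))  # [1.0, 3.0, 5.0, 7.0]
```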

plutols commented 2 years ago

it works, thanks

plutols commented 2 years ago

In your TFLite model, why do the outputs include sin and cos?

panhu commented 2 years ago

"I solved this problem, but I have a new one: when I convert my model to TFLite, I get an error (bad padding, only SAME and VALID are supported). Have you run into this?"

Hi, can you tell me how to convert a model to TFLite? Do I have to save my model as pb first if it is an h5 file?

Le-Xiaohuai-speech commented 2 years ago

Use tf.pad or keras.layers.ZeroPadding2D instead.

panhu commented 2 years ago

Hi, do I just need to save the trained model as pb and then use tf.pad or keras.layers.ZeroPadding2D instead? Do I need to modify the Lambda layer?

Le-Xiaohuai-speech commented 2 years ago

FFT and IFFT are not supported in TFLite. Convert only the network itself (from the first conv to the last deconv).
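A rough sketch of that split, with the transforms kept outside the converted part. Everything here is an assumption for illustration: `network` is an identity placeholder for the converted model, `enhance_frame` is a hypothetical wrapper, and the naive DFT stands in for whatever STFT code runs on the host side.

```python
import cmath


def dft(frame):
    """Naive forward DFT, computed outside the converted model."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse DFT, also outside the converted model."""
    n = len(spec)
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(spec)).real / n
        for t in range(n)
    ]


def network(real, imag):
    """Placeholder for the converted network; here it returns its input unchanged."""
    return real, imag


def enhance_frame(frame):
    spec = dft(frame)
    real, imag = network([c.real for c in spec], [c.imag for c in spec])
    return idft([complex(r, i) for r, i in zip(real, imag)])


frame = [0.0, 1.0, 0.0, -1.0]
restored = enhance_frame(frame)
```

Only `network` would be replaced by the TFLite interpreter call; the transforms stay in ordinary host code.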

Thanks. I used this method to convert my model:

modelparh = r"models_experiment_5/experiment_5model01-4.547007.h5"
model = tf.keras.models.load_model(modelparh, custom_objects={"DprnnBlock": DprnnBlock, "StLa": StLa, "MK_M": MK_M, "ifft_Layer": ifft_Layer, "Overlap_addLayer": Overlap_addLayer})
converter = tf.lite.TFLiteConverter.from_keras_model_file(model)
tflite_model = converter.convert()
savepath = r"weights_c1_d.tflite"
open(savepath, "wb").write(tflite_model)

But the conversion does not succeed. Could you share your method? Thanks.
