mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.
MIT License

trained model error #32

Closed. taylorlu closed this issue 5 years ago

taylorlu commented 5 years ago

Hi, the trained model (https://bitbucket.org/mravanelli/sincnet_models/) does not seem to work with compute_d_vector. Loading it fails with: Missing key(s) in state_dict: "conv.0.low_hz_", "conv.0.band_hz_". The keys in checkpoint_load['CNN_model_par'] are:

conv.0.filt_b1
conv.0.filt_band
conv.1.weight
conv.1.bias
conv.2.weight
conv.2.bias
bn.0.weight
bn.0.bias
bn.0.running_mean
bn.0.running_var
bn.1.weight
bn.1.bias
bn.1.running_mean
bn.1.running_var
bn.2.weight
bn.2.bias
bn.2.running_mean
bn.2.running_var
ln.0.gamma
ln.0.beta
ln.1.gamma
ln.1.beta
ln.2.gamma
ln.2.beta
ln0.gamma
ln0.beta
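
For anyone hitting the same error, a minimal sketch of how the mismatch can be diagnosed: compare the key names stored in the checkpoint with the key names the current model expects. The helper `diff_state_dict_keys` is hypothetical (not part of SincNet or PyTorch), and the sample below uses only a subset of the keys listed above. The two expected `conv.0` names come from the error message; the remaining expected names are assumed to be unchanged.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) as sorted lists of key names.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint holds but the model does not define
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Subset of the keys found in checkpoint_load['CNN_model_par'] (from the list above).
checkpoint_keys = ["conv.0.filt_b1", "conv.0.filt_band", "conv.1.weight", "conv.1.bias"]

# Keys the current code expects. The first two are taken from the error message;
# the other two are ASSUMED to be the same as in the checkpoint.
model_keys = ["conv.0.low_hz_", "conv.0.band_hz_", "conv.1.weight", "conv.1.bias"]

missing, unexpected = diff_state_dict_keys(checkpoint_keys, model_keys)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

With a real PyTorch model, the same helper could be called with `checkpoint_load['CNN_model_par'].keys()` and the `.keys()` of the model's `state_dict()`. It only reports the mismatch. Whether the old parameters can be converted to the new ones is a separate question that this sketch does not address.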

Thanks a lot.

mravanelli commented 5 years ago

OK, thank you for pointing it out. It seems to be a model trained with the previous version of SincNet. Let me update it as soon as possible!

Mirco

mravanelli commented 5 years ago

Hi, could you try with the new model in the external repository and with the new compute_d_vector code?

taylorlu commented 5 years ago

@mravanelli Thanks very much, it works well now; the d-vector has 2048 dimensions. For open-set speaker verification, I think the model could use a more discriminative loss function, such as ArcFace or AM-Softmax, instead of softmax on top of the CNN/DNN layers.
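
As a rough illustration of the AM-Softmax idea mentioned above (not code from SincNet), here is a sketch for a single example in plain Python. The embedding and each class weight vector are L2-normalized, so the logits are cosine similarities. A margin is subtracted from the target-class cosine before scaling, which pushes embeddings of the same speaker closer together. The function names and the default values `scale=30.0` and `margin=0.35` are assumptions chosen for illustration, and all vectors are assumed to be non-zero.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def am_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one example.

    embedding:     list of floats (e.g. a d-vector)
    class_weights: one weight vector per class, same length as embedding
    target:        index of the true class
    """
    unit_embedding = l2_normalize(embedding)
    cosines = [
        sum(a * b for a, b in zip(unit_embedding, l2_normalize(weights)))
        for weights in class_weights
    ]
    # Subtract the margin only from the target-class cosine, then scale all logits.
    logits = [
        scale * (cosine - margin) if index == target else scale * cosine
        for index, cosine in enumerate(cosines)
    ]
    # Cross-entropy via a numerically stable log-sum-exp.
    largest = max(logits)
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum_exp - logits[target]


# Toy example: a 3-dimensional embedding and two classes.
toy_embedding = [0.9, 0.1, 0.2]
toy_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
loss_with_margin = am_softmax_loss(toy_embedding, toy_weights, target=0)
loss_without_margin = am_softmax_loss(toy_embedding, toy_weights, target=0, margin=0.0)
print(loss_with_margin > loss_without_margin)
```

With `margin=0.0` this reduces to a softmax over scaled cosine similarities, so the margin is the only thing that makes the target class harder to satisfy during training.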

mravanelli commented 5 years ago

Sure, you can try it!
