sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++
MIT License

NaN in the text encoder value #27

Closed: Pranjalya closed this issue 10 months ago

Pranjalya commented 10 months ago

On a custom dataset, the output of the text encoder, here https://github.com/sh-lee-prml/HierSpeechpp/blob/baeaf74c111ac5fcc088744b14bad8f5c8301c93/ttv_v1/t2w2v_transformer.py#L393, has x coming out as NaN. As a result, several other parameters come out as 0 or NaN as well. Do you have any idea what could be causing this?

sh-lee-prml commented 10 months ago

I have not experienced NaN values in the text encoder.

I think some issues may occur when using very short text...

During training, we filtered some data as described in https://github.com/sh-lee-prml/HierSpeechpp/issues/20#issuecomment-1870806287.

We removed samples with short text and audio before training the model.

Thanks

Pranjalya commented 10 months ago

Thanks, that resolved it. I think the files causing this were mainly the ones that matched this filtering condition:

if len_txt*2+1 > len_data:
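
For anyone hitting the same problem, the condition above can be turned into a pre-training filter pass over the dataset. The following is only a minimal sketch, not the repository's actual data loader: the helper names `is_too_short` and `filter_samples` and the `(name, len_txt, len_data)` tuple layout are hypothetical, and it assumes `len_txt` is the number of text tokens and `len_data` is the number of audio feature frames for the same sample.

```python
from typing import List, Tuple

# Hypothetical sample record: (file name, text token count, audio frame count).
Sample = Tuple[str, int, int]


def is_too_short(len_txt: int, len_data: int) -> bool:
    """Return True when the sample matches the condition quoted in the thread."""
    return len_txt * 2 + 1 > len_data


def filter_samples(samples: List[Sample]) -> Tuple[List[str], List[str]]:
    """Split samples into (kept, dropped) file names using the condition above."""
    kept: List[str] = []
    dropped: List[str] = []
    for name, len_txt, len_data in samples:
        if is_too_short(len_txt, len_data):
            dropped.append(name)  # audio has too few frames for this text
        else:
            kept.append(name)
    return kept, dropped


# Made-up lengths, for illustration only.
samples = [("a.wav", 10, 50), ("b.wav", 30, 40), ("c.wav", 5, 11)]
kept, dropped = filter_samples(samples)
print("kept:", kept)
print("dropped:", dropped)
```

Logging the dropped file names, rather than skipping them silently, makes it easier to check whether the NaN values disappear once those samples are excluded.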