Closed Arafat4341 closed 3 years ago
I think it makes sense to supply the absolute path to your mp3
file, not a relative path. An absolute path makes sure the program is searching in the correct directory.
Path to model should be the path to your trained language identification model.
@Bartzi But I am getting an error even after supplying the absolute path of the mp3 file.
I trained the topcoder_crnn_finetune.py model and added model.load_weights("absolute/path/to/my/trained/weight", by_name=True) to the model module.
I passed the path models/topcoder_crnn_finetune.py when running predict.py.
Did I do everything correctly?
I am using Google Colab for training. I mounted Drive on Colab, so the read and write operations happen on the Drive.
So the absolute path of my audio is:
/content/drive/My\ Drive/crnn-lid/keras/audios/speech.mp3
I provided this path, but I am still getting:
('SpectrogramGenerator Exception: ', IOError(2, 'No such file or directory'), 'audios/speech.mp3')
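Note that the error still shows the relative string audios/speech.mp3, which is resolved against whatever directory the script is started from, not against keras/. A minimal sketch of how to confirm where a path actually resolves (the check_audio_path helper is mine, not from the repo):

```python
import os

def check_audio_path(path):
    """Return the absolute form of `path` and whether it points at a file."""
    abs_path = os.path.abspath(os.path.expanduser(path))
    return abs_path, os.path.isfile(abs_path)

# A relative path like the one in the error message resolves against the
# current working directory -- on Colab that is usually /content, not keras/.
abs_path, exists = check_audio_path("audios/speech.mp3")
print(abs_path, exists)

# Also note: the backslash in "My\ Drive" is shell escaping; inside Python
# (or when the path is passed on without a shell) the directory is
# literally "My Drive" with a plain space.
```

If exists comes back False, the program is simply looking in the wrong place, whatever path you believe you passed.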
@Bartzi Hello!
I am getting this error:
('SpectrogramGenerator Exception: ', IOError(2, 'No such file or directory'), 'audios/speech.mp3')
But I have a directory inside keras/ named audios/, and I have placed the audio file speech.mp3 there. I am still getting this error!
Here is my command line:
python predict.py --model models/logs/2020-07-29-05-05-31/weights.12.model --input audios/speech.mp3
Do you have any idea why I am getting this?! Thanks!
I found that only a 10 sec audio clip works for testing. After supplying a 10 sec clip, the error went away. Was that supposed to happen?
Hi,
sorry for not coming back to you earlier. Did you solve your problems now?
I saw only 10 sec audio clip for testing works.
Yes, that is correct for a model trained with default settings. You can, however, train another model by setting the segment length (https://github.com/HPI-DeepLearning/crnn-lid/blob/master/keras/config.yaml#L15).
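For 3 second snippets, the edit to config.yaml might look roughly like this. The key name segment_length is my assumption based on the linked line; check line 15 of your checkout for the exact name:

```yaml
# keras/config.yaml (key name assumed -- verify against line 15 of the file)
segment_length: 3   # seconds per audio snippet; the default is 10
```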
Thanks a lot for your response! @Bartzi Actually I trained with 10 sec audio files, but I am talking about testing the trained model with new audio.
If you only train on 10 sec audio files, the resulting model will also only work with 10 second snippets :shrug:. If you want to use different audio lengths, you'll have to train new models.
Ah... I see! Thanks a lot!
@Bartzi Don't I need to change anything else in the data preparation stage? I mean, cutting the files into 3 second chunks?
Of course, you also need to create spectrograms according to the audio length you want to use.
@Bartzi Thanks a lot.
I have made a change in download_youtube.py at line 67:
command = ["ffmpeg", "-y", "-i", f, "-map", "0", "-ac", "1", "-ar", "16000", "-f", "segment", "-segment_time", "3", output_filename]
I have set segment_time to 3, and I also updated config.yaml. Is that all?
@Bartzi I am failing to create mel-spectrograms for 3 sec long audios. The images are not generated. Is there a specific input shape or pixels per second for 3 sec long audios?
I think so :sweat_smile: if you look at the default values, you can see that if we set pixels_per_second to 50 and have audio snippets of 10 seconds, we supply a width of 500, since 50 * 10 = 500.
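Following that arithmetic, the width for any snippet length is just pixels_per_second times the segment duration; a tiny sketch (the helper name is mine, not from the repo):

```python
def spectrogram_width(pixels_per_second, segment_seconds):
    """Width in pixels of a spectrogram covering one audio segment."""
    return pixels_per_second * segment_seconds

print(spectrogram_width(50, 10))  # 500, the default in this thread
print(spectrogram_width(50, 3))   # 150, for 3 second chunks
```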
Ah! Thanks a lot! I changed the width to 150 now!
I am trying to predict. I have specified the audio path correctly, but I am still getting an error:
ValueError: need at least one array to stack
full error:
Does the input audio need to be exactly 10 seconds? Another question:
python predict.py --model <path/to/model> --input <path/to/speech.mp3>
Here, what should the path to the model be?
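On the "need at least one array to stack" ValueError above: that message typically means an empty list was handed to a NumPy stacking call, i.e. the generator produced zero complete segments from the input audio. A sketch of the counting logic (the function name is mine, not from the repo, and this assumes spectrograms are only produced for complete segments):

```python
import math

def num_full_segments(audio_seconds, segment_seconds):
    """How many complete segments fit into the clip; 0 here would
    reproduce the 'need at least one array to stack' error."""
    return math.floor(audio_seconds / segment_seconds)

print(num_full_segments(10, 10))  # 1 -> one spectrogram, prediction can run
print(num_full_segments(7, 10))   # 0 -> nothing to stack
```

So for a model trained on 10 second snippets, the input clip has to be at least 10 seconds long.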