nanometer34688 closed this issue 4 years ago
Hi, actually not. If you read the help for the 'audio_preprocessing' subcommand, you will find this description of the '--max_audio_length' option:
Set this value to the maximum length (in samples with desired sample rate) of single wav
You can easily modify the code to take shorter segments for each sample.
Giovanni
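As an illustration of taking shorter segments, here is a minimal sketch that splits a waveform into consecutive chunks no longer than the limit. The helper name `split_into_segments` is hypothetical and is not part of the repository; it only assumes the audio is a 1-D NumPy array.

```python
import numpy as np

def split_into_segments(samples, max_audio_length):
    """Split a 1-D waveform into consecutive chunks of at most max_audio_length samples."""
    if max_audio_length <= 0:
        raise ValueError("max_audio_length must be positive")
    # The last chunk may be shorter than the others.
    return [samples[start:start + max_audio_length]
            for start in range(0, len(samples), max_audio_length)]

# Synthetic waveform standing in for a real wav file (hypothetical data).
wav = np.arange(100000, dtype=np.float32)
segments = split_into_segments(wav, 42000)
segment_lengths = [len(s) for s in segments]
```

Each chunk could then be treated as a separate sample, provided the matching video frames are split at the same boundaries.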
That's the option I am having trouble with. If an audio file is longer than 42000 samples, I get that error.
This is the command I run:
python3 av_speech_enhancement.py audio_preprocessing --data_dir zipped_data/TEST_SET --speaker_ids 2 8 --dest_dir . --audio_dir . -ml 42000
And this is the error I get:
audio_samples[i, n_fft//2: len(samples) + n_fft//2] = samples
ValueError: could not broadcast input array from shape (42240) into shape (42000)
Is this something you have seen before?
On the GRID Corpus dataset, did you use the default value of 48000 for the max_audio_length option?
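For reference, the shape mismatch can be reproduced in isolation with NumPy. This is only a sketch: the FFT size of 512 and the buffer width are assumptions chosen so that the shapes match the reported message, and they are not taken from the repository.

```python
import numpy as np

n_fft = 512                 # assumed FFT size, not confirmed in this thread
max_audio_length = 42000    # the value passed with -ml
# Assumed buffer width, picked so the shapes line up with the reported error.
audio_samples = np.zeros((1, max_audio_length + n_fft // 2), dtype=np.float32)
samples = np.ones(42240, dtype=np.float32)   # a clip longer than the limit

# A slice that runs past the end of the row is silently clipped,
# so the destination ends up shorter than the source.
target_len = audio_samples[0, n_fft // 2: len(samples) + n_fft // 2].shape[0]

try:
    audio_samples[0, n_fft // 2: len(samples) + n_fft // 2] = samples
    error = None
except ValueError as exc:
    error = exc   # NumPy refuses to broadcast 42240 values into 42000 slots
```

The point is that the left-hand slice is bounded by the preallocated row while the right-hand side is not, so any clip longer than the available room triggers the error.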
@dr-pato I seem to have found the issue.
audio_features.py line 39 is:
audio_samples[i, n_fft//2: len(samples) + n_fft//2] = samples
But it raises an error because samples can be longer than the space available in audio_samples.
The fix I made was as follows:
audio_samples[i, n_fft//2: len(samples) + n_fft//2] = samples[:max_audio_length]
My audio files are now truncated to the maximum length, which fixes the issue.
Thank you
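A slightly more defensive variant of this truncation is sketched below. The helper `copy_truncated`, the FFT size of 512, and the buffer width are all assumptions for illustration, not the repository's code. It computes the destination slice from the truncated array, so it does not depend on NumPy clipping an out-of-range slice.

```python
import numpy as np

def copy_truncated(row, samples, offset):
    """Copy samples into row starting at offset, dropping whatever does not fit."""
    room = max(row.shape[0] - offset, 0)
    clipped = samples[:room]
    # Source and destination now have the same length by construction.
    row[offset:offset + len(clipped)] = clipped
    return len(clipped)

n_fft = 512                 # assumed FFT size
max_audio_length = 42000
# Assumed layout: room for n_fft // 2 samples of padding on each side.
audio_samples = np.zeros((2, max_audio_length + n_fft), dtype=np.float32)
long_clip = np.ones(42240, dtype=np.float32)
short_clip = np.ones(30000, dtype=np.float32)

# Truncate to the limit first, then copy at the padding offset.
written_long = copy_truncated(audio_samples[0], long_clip[:max_audio_length], n_fft // 2)
written_short = copy_truncated(audio_samples[1], short_clip[:max_audio_length], n_fft // 2)
```

Clips shorter than the limit are copied whole and the rest of the row stays zero-padded.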
I have found an issue while using the GRID dataset. The audio/video files vary in length. With the audio_preprocessing script, the option 'max_wav_length' seems to fail when I try to set the desired length of the audio.
For example, if I run it with 42000 as the max wav length, I get this error:
I'm assuming that it should cut the audio down to 42000 samples. Am I correct in thinking this?