Open A-Telfer opened 1 year ago
Could you solve the issue?
I think it was a typo in the docs and we were supposed to run the similarly named .py file, but there are some other errors with that. Still debugging.
> I've had a few issues getting the dataset built as well. This is what I have had to do so far:
> - I modified the origin points in the noisyspeech_synthesizer.py file (these lines), since the default download script extracts the raw files to /microsoft_dns/datasets_fullband/datasets_fullband/.
> - I have multiple versions of Python in my environment, so for me the command to run is python noisyspeech_synthesizer.py -root ./
> - I had to downgrade my numpy version from 1.24.3 to 1.23.5, and my librosa version from 0.10.0 to 0.8.1.

Exactly the same here, except rather than rename the path I just moved the raw files up a level.
I started running step 4 on my laptop and gave up waiting after 60,000+ or so, since there was no progress bar (the code uses while loops, so it was not immediately clear how long it would take), and I started thinking it might not work on a partial dataset download.
As an update, the synthesizer takes quite a while depending on your machine. I'm using an AMD Ryzen Threadripper processor with the raw files loaded onto a native SSD, and it took about 115 hours just to generate the training_set. It seems like the validation set will take roughly the same.
@A-Telfer I'm not sure what you mean by "no progress bar," as I periodically saw output from the synthesizer script indicating when it had to retry synthesizing some of the audio files:
Number of files to be synthesized: 60000
Start idx: 0
Stop idx: 59999
Generating synthesized data in ./
Warning: File #5 has unexpected clipping, returning without writing audio to disk
Warning: File #29 has unexpected clipping, returning without writing audio to disk
...
Warning: File #1114 has unexpected clipping, returning without writing audio to disk
Warning: File #1130 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #1151 has unexpected clipping, returning without writing audio to disk
Warning: File #1164 has unexpected clipping, returning without writing audio to disk
...
Warning: File #29071 has unexpected clipping, returning without writing audio to disk
Warning: File #29090 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #29107 has unexpected clipping, returning without writing audio to disk
Warning: File #29118 has unexpected clipping, returning without writing audio to disk
...
Warning: File #34891 has unexpected clipping, returning without writing audio to disk
Warning: File #34891 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #34919 has unexpected clipping, returning without writing audio to disk
Warning: File #34925 has unexpected clipping, returning without writing audio to disk
...
Warning: File #44648 has unexpected clipping, returning without writing audio to disk
Warning: File #44661 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #44699 has unexpected clipping, returning without writing audio to disk
Warning: File #44711 has unexpected clipping, returning without writing audio to disk
...
Warning: File #59935 has unexpected clipping, returning without writing audio to disk
Warning: File #59962 has unexpected clipping, returning without writing audio to disk
Warning: File #59989 has unexpected clipping, returning without writing audio to disk
Warning: File #59991 has unexpected clipping, returning without writing audio to disk
Warning: File #59997 has unexpected clipping, returning without writing audio to disk
Of the 466391 clean speech files analyzed, 2.6% had clipping, and 46.8% had low activity (below 60.0% active percentage)
Of the 221062 noise files analyzed, 18.4% had clipping, and 0.0% had low activity (below 0.0% active percentage)
I believe that it should work on a partial dataset, based on the information given during the orientation. Is this not the output you saw @A-Telfer?
For me, the training_set completed with the following properties: 180,003 items, totalling 172.8 GB. Not entirely sure if this is the expected output of the synth script.
Hi all, I have fixed the typo in the README. As you already noted, it should have been noisyspeech_synthesizer.py, not .cfg.
Assuming you have downloaded your dataset to ./data/datasets_fullband/, the commands to execute are:
python noisyspeech_synthesizer.py -root ./data/datasets_fullband/
python noisyspeech_synthesizer.py -root ./data/datasets_fullband/ -is_validation_set true
The synthesis does take a lot of time, and there is no progress bar in the script. A way to monitor the progress is:
ls -l data/datasets_fullband/training_set/clean/*.wav | wc -l
ls -l data/datasets_fullband/validation_set/clean/*.wav | wc -l
These print the number of clean samples generated so far in the training and validation sets, respectively. @BujSet 180,003 items looks correct. It's 60k audio samples each for clean, noise, and noisy.
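The same check can be sketched in Python, which avoids expanding every file name on the shell command line. This is only a sketch: the directory layout is taken from the commands above, the target of 60000 comes from the "Number of files to be synthesized" line in the log earlier in this thread, and applying that same target to the validation set is an assumption. The helper names `count_wavs` and `progress_line` are made up for illustration and are not part of the repository.

```python
from pathlib import Path


def count_wavs(directory):
    """Count .wav files directly inside `directory` (0 if it does not exist yet)."""
    path = Path(directory)
    if not path.is_dir():
        return 0
    return sum(1 for _ in path.glob("*.wav"))


def progress_line(name, directory, target):
    """Format a one-line progress report for one output split."""
    done = count_wavs(directory)
    percent = 100.0 * done / target if target else 0.0
    return f"{name}: {done}/{target} clean files ({percent:.1f}%)"


if __name__ == "__main__":
    # Assumed layout: <root>/<split>/clean/*.wav, as in the ls commands above.
    root = Path("data/datasets_fullband")
    target = 60000  # assumption: same target for both splits
    for split in ("training_set", "validation_set"):
        print(progress_line(split, root / split / "clean", target))
```

Running it repeatedly (for example from a second terminal) shows how far the synthesis has progressed without touching the synthesizer itself.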
I've had a few issues getting the dataset built as well. This is what I have had to do so far:
- I modified the origin points in the noisyspeech_synthesizer.py file (these lines), since the default download script extracts the raw files to /microsoft_dns/datasets_fullband/datasets_fullband/.
- I have multiple versions of Python in my environment, so for me the command to run is python noisyspeech_synthesizer.py -root ./
- However, when I first ran that command, I got many versioning errors. After a little digging, it seems that the latest version of librosa is not compatible with the latest version of numpy. I had to downgrade my numpy version from 1.24.3 to 1.23.5, and my librosa version from 0.10.0 to 0.8.1.
- After that, running the command in step 2 generates the training and validation files (I think). This is currently in progress for me, but I'll follow up if this completes without error.
I have librosa v0.10.0 and numpy v1.23.5 and it worked, but in microsoft_dns/noisyspeech_synthetizer_singleprocess.py line 90 I had to change librosa.resample(arg1, arg2, arg3) to librosa.resample(input_audio, orig_sr=fs_input, target_sr=fs_output).
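A minimal sketch of that change as a small wrapper, for anyone who prefers not to edit the call in place. As far as I know, librosa 0.10 accepts the sample rates only as keyword arguments, and the `orig_sr` / `target_sr` names also exist in the older releases mentioned in this thread, so the keyword form should work across them. The wrapper name `resample_compat` and its `resample_fn` parameter are hypothetical and only there so the helper can be tried without librosa installed.

```python
def resample_compat(input_audio, fs_input, fs_output, resample_fn=None):
    """Resample audio, always passing the sample rates as keyword arguments."""
    if resample_fn is None:
        # Imported lazily so the wrapper itself does not require librosa at import time.
        import librosa
        resample_fn = librosa.resample
    return resample_fn(input_audio, orig_sr=fs_input, target_sr=fs_output)
```

With librosa installed, the call from the comment above becomes `resample_compat(input_audio, fs_input, fs_output)`.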
@daevem thanks for posting this information. Perhaps the librosa interface has changed at some point. A working combination we have for the current version of the code is:
librosa==0.9.2
numpy==1.23.3
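To compare a local environment against that combination before starting a multi-hour synthesis run, something like the following could be used. It is a sketch only: the pinned versions are the ones listed above, while the function name `check_versions` and the `KNOWN_GOOD` table are made up for illustration. It relies on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata

# Combination reported as working in this thread.
KNOWN_GOOD = {"librosa": "0.9.2", "numpy": "1.23.3"}


def check_versions(expected=None, get_version=metadata.version):
    """Return {package: (installed_or_None, expected)} for every mismatch."""
    if expected is None:
        expected = KNOWN_GOOD
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    problems = check_versions()
    if not problems:
        print("Installed versions match the known-good combination.")
    for package, (installed, wanted) in problems.items():
        print(f"{package}: installed {installed}, known-good {wanted}")
```

A mismatch is not necessarily a problem, since other combinations are reported as working earlier in the thread; the output is just a hint about where to look if versioning errors appear.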
Running the dataset synthesis step
... result in this error