bsshruthi22 closed this issue 2 months ago
Is there any way that I can optimize it?
Yes, you can start several terminals, each using a different --start --stop range. For instance:
# Terminal 1
--start 0 --stop 100 --num-splits 240043
# Terminal 2
--start 100 --stop 200 --num-splits 240043
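If you want more than two terminals, the ranges can be computed rather than typed by hand. Here is a minimal sketch. The helper name `split_ranges` is hypothetical, and it assumes --stop is exclusive, as the 0-100 / 100-200 example above suggests. It only prints the arguments; it does not run the extraction script.

```python
def split_ranges(num_splits: int, num_workers: int, first: int = 0) -> list[tuple[int, int]]:
    """Divide [first, num_splits) into num_workers contiguous (start, stop) ranges.

    Assumption: stop is exclusive, so adjacent ranges share a boundary value.
    """
    total = num_splits - first
    base, extra = divmod(total, num_workers)
    ranges = []
    start = first
    for i in range(num_workers):
        # The first `extra` workers take one additional split each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


if __name__ == "__main__":
    for i, (start, stop) in enumerate(split_ranges(240043, 4), 1):
        print(f"# Terminal {i}")
        print(f"--start {start} --stop {stop} --num-splits 240043")
```

Passing `first=160000` would cover only the splits after the ones already processed, if they were processed in order.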
Since extraction is going on, if I stop it, is there a way to resume it and extract only what is not yet done?
Yes, the script will skip files that are already extracted.
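To see what would still be left before restarting, you could list the splits that have no output file yet. This is only a rough sketch of the idea, not the script's own logic: the function `pending_splits` and the file name pattern `feats_{idx}.done` are hypothetical, so adjust the pattern to whatever files your extraction actually writes.

```python
from pathlib import Path


def pending_splits(output_dir: str, num_splits: int, pattern: str = "feats_{idx}.done") -> list[int]:
    """Return split indices in [0, num_splits) that have no output file yet.

    `pattern` is a hypothetical file name template; replace it with the
    real naming used by your extraction output.
    """
    out = Path(output_dir)
    return [
        idx
        for idx in range(num_splits)
        if not (out / pattern.format(idx=idx)).is_file()
    ]
```

For example, `len(pending_splits("path/to/output", 240043))` gives the number of splits still to do under that naming assumption.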
1) Does it take this long?
It depends on the I/O of your disk and also the format of your data.
If it is .wav, then it should not take so long for only 5k hours of data.
We are using .wav files
It's been two and a half days, and feature extraction is done for 160000 splits.
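From those numbers you can get a rough estimate of the time left. The sketch below assumes the extraction rate stays constant and, when several terminals are used, that throughput scales linearly with the number of terminals. That second assumption is optimistic if disk I/O is the bottleneck. The function name `remaining_days` is hypothetical.

```python
def remaining_days(done: int, total: int, elapsed_days: float, workers: int = 1) -> float:
    """Estimate days left, assuming a constant rate that scales linearly with workers."""
    rate_per_day = done / elapsed_days  # observed single-process rate
    return (total - done) / (rate_per_day * workers)


if __name__ == "__main__":
    # 160000 of 240043 splits done in 2.5 days: 64000 splits per day.
    print(f"1 terminal:  {remaining_days(160000, 240043, 2.5):.2f} days")
    print(f"4 terminals: {remaining_days(160000, 240043, 2.5, workers=4):.2f} days")
```

With a single process that comes to roughly 1.25 more days for the remaining 80043 splits.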
I am doing feature extraction for around 5000 hours of data, using the GigaSpeech code. My training set has 240043 splits. I am using a system with the following configuration:
cores: 24
RAM: 130 GB
GPU: NVIDIA RTX A6000 with 48 GB memory
Feature extraction has been running for 2 days.
1) Does it take this long? Is there any way that I can optimize it? If yes, can I do that while extraction is in progress?
2) Since extraction is going on, if I stop it, is there a way to resume it and extract only what is not yet done?
Thanks in advance.