sciai-ai closed this issue 3 years ago.
Hmm. I think it depends on your machine usage. Basically, the initial inference seems to be slow due to some CUDA initialization (maybe). How large is the difference?
It seems that parallel-wavegan-decode does not force-synchronize CUDA operations. You might want to call torch.cuda.synchronize() before and after the inference. Pseudocode will look like:
import time
import torch

torch.cuda.synchronize()  # wait for pending CUDA work before starting the timer
start = time.time()
with torch.no_grad():
    ...  # do inference
torch.cuda.synchronize()  # wait for the inference kernels to actually finish
elapsed_time = time.time() - start
See https://pytorch.org/docs/stable/notes/cuda.html for details.
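The pattern above can be wrapped in a small reusable helper. This is a minimal sketch (the `cuda_timer` name and the results-dict interface are my own, not from the thread): it synchronizes CUDA before and after the timed region when a GPU is available, and falls back to plain wall-clock timing otherwise.

```python
import time
from contextlib import contextmanager

try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:
    _HAS_CUDA = False


@contextmanager
def cuda_timer(results, key):
    """Time a block of code, synchronizing CUDA so that asynchronously
    launched kernels are charged to this block and not to a later call."""
    if _HAS_CUDA:
        torch.cuda.synchronize()
    start = time.time()
    try:
        yield
    finally:
        if _HAS_CUDA:
            torch.cuda.synchronize()
        results[key] = time.time() - start


# Usage sketch:
# times = {}
# with cuda_timer(times, "inference"):
#     wav = vocoder.inference(c)
# print(times["inference"])
```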
It seems the variation is not so large for smaller input sizes, but when the input is 200-300 chars the variation can be up to 3 seconds.
Using torch.cuda.synchronize() also makes no difference to the variation.
I have noticed another strange timing behaviour when writing the wav array to an output wav file. The code snippet below increases the runtime by 3x compared to the same code without the scipy write call. I tested with a 1000-char input. Can you please check? I used the LJSpeech model from the espnet TTS notebook; I also tried sf.write with the same results.
import time
import scipy.io.wavfile

start = time.time()
# synthesis
with torch.no_grad():
    start2 = time.time()
    wav, c, *_ = text2speech(x)
    wav = vocoder.inference(c)
    print(time.time() - start2)
# copy to CPU and write wav file
wav2 = wav.view(-1).cpu().numpy()
scipy.io.wavfile.write('123.wav', 22050, wav2)
print(time.time() - start)
When I just execute the code separately, the runtime is very quick.
Maybe better to check the time without the I/O.
Haha, I know why: it's not the I/O, it's the time taken to copy the tensor from GPU to CPU 😊
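This is the same asynchrony issue again: the `.cpu()` call is the first operation that must block until all pending GPU kernels finish, so the whole inference cost is attributed to the "copy". The effect can be demonstrated with a CPU-only analogy using a background thread (this is an illustrative sketch, not PyTorch internals; the function name is my own):

```python
import time
import threading


def naive_vs_blocking_timing(work_seconds=0.1):
    """A background thread plays the role of the GPU: launching work returns
    immediately, and the full cost only appears at the first blocking call
    (here join(), analogous to .cpu()/.numpy())."""
    worker = threading.Thread(target=time.sleep, args=(work_seconds,))

    start = time.time()
    worker.start()              # "kernel launch": returns almost instantly
    launch_time = time.time() - start

    start = time.time()
    worker.join()               # "copy to CPU": blocks until work is done
    copy_time = time.time() - start
    return launch_time, copy_time
```

Here `launch_time` is tiny while `copy_time` is roughly `work_seconds`: the copy itself is not slow, it simply absorbs the wait for the asynchronous work.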
Hi @kan-bayashi
I have noticed that the RTF varies even when we use manual seeds for both taco2 and pwg. I am wondering where this randomness comes from that makes the tensor computation slower or faster? #291
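One thing worth noting: fixed seeds make the *outputs* reproducible, but not the *wall-clock time*; timing can still vary with cuDNN autotuning, GPU clock scaling, and other processes on the machine. The sketch below (the `make_deterministic` name is my own; the torch settings are commonly toggled to reduce nondeterminism, and `cudnn.benchmark` in particular re-times conv algorithms per shape and is a common source of timing variation) shows one way to seed everything in one place, degrading gracefully if numpy or torch is absent:

```python
def make_deterministic(seed=0):
    """Seed Python, NumPy, and PyTorch (when installed) and disable
    cuDNN autotuning, which can cause run-to-run timing variation."""
    import random
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
    except ImportError:
        pass
    return seed
```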