It should be possible - you would need to remove any explicit conversions to CUDA. There might also be other changes necessary; I think you would also need to disable cuDNN.
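For the cuDNN part, the usual switch is the corresponding torch.backends flag; a minimal sketch, not specific to this repository:

import torch

# cuDNN only matters on GPU; disabling it keeps execution away from cuDNN-specific code paths.
torch.backends.cudnn.enabled = False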
Many thanks. I tried, however without success. Is there maybe a step-by-step procedure for how to proceed? Has nobody tried this before?
Hi @welliX, are you getting any errors when converting the code to CPU-only?
Hi @GrzegorzKarchNV, thanks for getting back to me!
The final part of the traceback from inference.py is:
File "/mnt/crypton/home/kiessl/anaconda3/lib/python3.7/subprocess.py", line 488, in run with Popen(*popenargs, **kwargs) as process: File "/mnt/crypton/home/kiessl/anaconda3/lib/python3.7/subprocess.py", line 800, in __init__ restore_signals, start_new_session) File "/mnt/crypton/home/kiessl/anaconda3/lib/python3.7/subprocess.py", line 1551, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename) FileNotFoundError: [Errno 2] No such file or directory: 'nvidia-smi': 'nvidia-smi'
As said, I want to try CPU-only inference and do not have an NVIDIA GPU.
It would be great if you could tell me how to prevent the usage of 'nvidia-smi'/CUDA/...
In inference.py (and in models.py as well) I replaced cuda with cpu; here is the diff (RCS format):
diff -n inference.py.org inference.py.new
d119 1
a119 1
model = models.get_model(model_name, model_config, to_cuda=False, rename=rename)
d122 1
a122 1
state_dict = torch.load(checkpoint, map_location='cpu')['state_dict']
d164 3
a166 3
if torch.cpu.is_available():
    text_padded = torch.autograd.Variable(text_padded).cpu().long()
    input_lengths = torch.autograd.Variable(input_lengths).cpu().long()
d180 1
a180 1
torch.cpu.synchronize()
d184 1
a184 1
torch.cpu.synchronize()
d218 1
a218 1
denoiser = Denoiser(waveglow).cpu()
d234 2
a235 2
dtype=torch.long).cpu()
input_lengths = torch.IntTensor([sequence.size(1)]).cpu().long()
thanks for any hint!
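As a side note, torch.cpu.is_available() and torch.cpu.synchronize() may not exist, depending on the PyTorch version; a more common device-agnostic pattern is to pick a torch.device once and move models and tensors with .to(device). Below is a minimal sketch along the lines of the diff above (get_model, to_cuda, rename and the 'state_dict' key come from that diff; everything else is illustrative, not the repository's exact code):

import torch

def load_model_for_any_device(models, model_name, model_config, rename, checkpoint_path):
    """Sketch: load a checkpoint on GPU if one is present, otherwise on CPU."""
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # to_cuda=False keeps get_model from calling .cuda() internally (as in the diff above).
    model = models.get_model(model_name, model_config, to_cuda=False, rename=rename)
    # map_location remaps tensors that were saved from GPU memory onto the chosen device.
    state_dict = torch.load(checkpoint_path, map_location=device)['state_dict']
    model.load_state_dict(state_dict)
    return model.to(device).eval(), device

Inputs can then be moved the same way, e.g. text_padded = text_padded.to(device).long(), and any torch.cuda.synchronize() call can be guarded with if device.type == 'cuda'.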
You need to comment out log_hardware() in https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechSynthesis/Tacotron2/inference.py#L211; this function calls nvidia-smi to log GPU info.
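An alternative to deleting the call is to guard it, so the same script still logs GPU info on machines that do have nvidia-smi; a small sketch, assuming log_hardware() (the helper referenced above) is purely informational:

import shutil

# Only call the GPU-logging helper when nvidia-smi is actually on the PATH.
if shutil.which('nvidia-smi') is not None:
    log_hardware()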
Cool, many thanks, one step further: inference.py is starting! However, now an error is thrown due to a shape mismatch (by a factor of 4, see below). The checkpoint models were created (and written) by train.py, which however ran on a different machine using GPU/CUDA. Can this be the reason for the mismatch, and is there a means to transform them into the right format? I already tried some other parameters, like with and without --amp-run, without success. Many thanks in advance!
:::NVLOGv0.2.2 Tacotron2_PyT 1576077661.701038122 (/mnt/allhome/TMP/work/WaveGlow/NVIDIA_DeepLearningExamples_PyTorch_SpeechSynthesis_Tacotron2/DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/dllogger/logger.py:251) args: {"input": txt", "output": "output/", "tacotron2": "/TMP/work/WaveGlow/PretrainedModels/OwnTrainings/checkpoint_Tacotron2_350", "waveglow": "/TMP/work/WaveGlow/PretrainedModels/OwnTrainings/checkpoint_WaveGlow_1000", "sigma_infer": 0.9, "denoising_stsampling_rate": 22050, "amp_run": true, "log_file": "nvlog.json", "include_warmup": false, "stft_hop_length": 256}
Traceback (most recent call last):
File "inference.py", line 275, in
Any idea? May it be that the reason for the shape mismatch is that the training (train.py) was conducted on GPU/CUDA whereas inference.py is run on CPU? Is there a means to transform the models into the right format?
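A shape mismatch usually points to a difference in model configuration between training and inference rather than to GPU vs. CPU; the device itself is handled by map_location, and amp/FP16 weights can be cast back to float32 for CPU inference. A hedged sketch (the 'state_dict' key follows the diff above; the helper name is made up):

import torch

def load_gpu_checkpoint_on_cpu(model, checkpoint_path):
    # map_location='cpu' remaps tensors that were saved from GPU memory.
    checkpoint = torch.load(checkpoint_path, map_location='cpu')
    model.load_state_dict(checkpoint['state_dict'])
    # amp/FP16 checkpoints may contain half-precision weights; CPU inference wants float32.
    return model.float().eval()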
Got it!! With the models JoC_Tacotron2_FP16_PyT_20190306 and JoC_WaveGlow_FP16_PyT_20190306 I could make it run on my CPU-only laptop:
python inference.py --tacotron2 $tacotronCP --waveglow $waveglowCP -o output/ -i phrases/phrase.txt
Besides inference.py, waveglow/denoiser.py and tacotron2/model.py also needed the cuda => cpu adaptation (see the sketch below).
@GrzegorzKarchNV - many thanks for your support!
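For reference, the typical cuda => cpu adaptation inside model code is to stop hard-coding .cuda() on freshly created tensors and instead create them on the device of an existing tensor; an illustrative sketch (the function name is made up, not the repository's exact code):

import torch

def sequence_mask(lengths):
    # Build the helper tensor on the same device as the input instead of calling .cuda().
    max_len = int(lengths.max().item())
    ids = torch.arange(max_len, device=lengths.device)
    # Broadcasting yields a (batch, max_len) boolean mask.
    return ids < lengths.unsqueeze(1)

Written this way, the same code runs unchanged on CPU and GPU inputs, e.g. sequence_mask(torch.tensor([3, 1, 2])).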
Hi @welliX. Can you share the performance when inferencing on CPU? Is inference real-time?
Can you share the performance when inferencing on CPU? Is inference real-time?
Acoustic quality is quite OK.
Real-time? Of course not. On my machine (an Asus laptop with 4 CPUs: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, x86_64), about 18 seconds of user time are needed per 1 second of TTS speech.
Is it possible to run inference.py on a CPU-only device? If yes, what steps need to be done in detail? I think it is valuable to be able to test inference (with pre-trained models) on CPU only when no GPU is available.