NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

GPU memory is used although cpu() is called #3289

Closed: awsomecod closed this issue 2 years ago

awsomecod commented 2 years ago

I ran the following commands:

import nemo
import nemo.collections.asr as nemo_asr
import nemo.collections.nlp as nemo_nlp
import nemo.collections.tts as nemo_tts

# Download the pretrained QuartzNet checkpoint, then move the model to the CPU
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="stt_en_quartznet15x5").cpu()
# Transcribe a local WAV file
English_text = quartznet.transcribe(['~/audio1.wav'])

In the output of nvidia-smi, I see that about 800 MB of GPU memory is in use while these commands run. Why? My expectation is that no GPU memory should be used when I call cpu().

titu1994 commented 2 years ago

By default, from_pretrained and restore_from will restore and place the model on the GPU if no map_location is provided. So the model first goes onto the GPU and is only then moved to the CPU with .cpu(); by that point the CUDA context has already been initialized, which is what nvidia-smi reports. You will need to pass map_location="cpu" to from_pretrained.
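For reference, a minimal sketch of the suggested fix, reusing the model name and audio path from the report above:

import nemo.collections.asr as nemo_asr

# Load the checkpoint directly onto the CPU so no CUDA context is initialized
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(
    model_name="stt_en_quartznet15x5", map_location="cpu"
)
English_text = quartznet.transcribe(['~/audio1.wav'])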

awsomecod commented 2 years ago

That fixed the issue.