NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

Superficial KALDI_ROOT and torchaudio warnings when importing nemo.collections.nlp in NGC container #998

Closed supertetelman closed 3 years ago

supertetelman commented 4 years ago

I am using the latest NeMo image from NGC:

docker run -it nvcr.io/nvidia/nemo:v0.11

I am importing the nlp collections module by running:

python -c "import nemo.collections.nlp as nemo_nlp"

I get the below warning messages:

################################################################################
### WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
###          (please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
###          (or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################

[NeMo W 2020-08-03 20:35:22 audio_preprocessing:56] Could not import torchaudio. Some features might not work.

I was able to get the KALDI warnings to go away by adding the following to my Dockerfile:

FROM nvcr.io/nvidia/nemo:v0.11

# Point KALDI_ROOT at an existing directory to suppress the warning printed during NeMo import
ENV KALDI_ROOT=/workspace

Attempting to pip install torchaudio just resulted in a Segmentation fault, so I'll have to live with that warning.

This is a superficial issue, but it would be nice if KALDI_ROOT were set by default to a path that exists, either in the NeMo framework or within the NGC container. I have seen these warnings confuse several people working through the provided example notebooks.
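
For anyone who cannot rebuild the image, the same workaround can probably be applied from Python before NeMo is imported. The sketch below rests on an assumption drawn from the Dockerfile fix above: the warning is emitted at import time by a check that the KALDI_ROOT path exists. The helper name `ensure_kaldi_root` and the temporary-directory fallback are hypothetical, not part of NeMo. This only silences the warning; it does not install Kaldi, so anything that actually needs Kaldi will still not work.

```python
import os
import tempfile


def ensure_kaldi_root(fallback_dir=None):
    """Point KALDI_ROOT at an existing directory if it is unset or invalid.

    Returns the value of KALDI_ROOT after the check.
    """
    current = os.environ.get("KALDI_ROOT")
    if current and os.path.isdir(current):
        return current  # already points at a real directory, leave it alone

    # Hypothetical fallback: any existing directory should satisfy a
    # path-existence check; it does not need to contain a Kaldi build.
    fallback = fallback_dir or tempfile.gettempdir()
    os.environ["KALDI_ROOT"] = fallback
    return fallback


kaldi_root = ensure_kaldi_root()

# Import NeMo only after the variable is set, for example:
# import nemo.collections.nlp as nemo_nlp
```

In a notebook, this would go in the first cell, ahead of any NeMo import.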

viveksj commented 4 years ago

Getting the same error

jaganadhg commented 4 years ago

Compiling and configuring Kaldi (http://kaldi-asr.org/) resolves this issue. I think this should be part of the install or development documentation (especially in the tutorial notebooks).