Import typo in auto_asr.py? (Issue #22)
Open · lukecdash opened 4 months ago
Yes, sorry about that - it is fixed in the main branch, but please see https://github.com/dusty-nv/NanoLLM/issues/20#issuecomment-2181285298 for a workaround that uses the 24.6 release of the nano_llm container for now 👍
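For anyone landing here before the fix ships, the gist of the linked workaround is to run the 24.6 release of the container explicitly instead of letting autotag pick the default image. A rough sketch of that command is below; the 24.6-r36.2.0 tag is my assumption here, so check the linked comment for the exact tag that matches your JetPack/L4T version.

```bash
# Run the 24.6 release of the nano_llm container explicitly instead of $(autotag nano_llm).
# NOTE: the image tag below is assumed for JetPack 6 / L4T r36.x; see the comment linked
# above for the tag that matches your setup.
jetson-containers run --env HUGGINGFACE_TOKEN=<your_token> \
    dustynv/nano_llm:24.6-r36.2.0 \
    python3 -m nano_llm.agents.web_chat --api=mlc \
        --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper
```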
Oh sorry, I missed that. Thank you for pointing me to it!
Hi, I'm sorry, this library looks amazing; I am just trying to run this example from the documentation. The traceback indicates to me that it is looking in plugins/audio for a module riva_asr.py, which is actually in plugins/speech. Could this just be as simple as a typo in the code? The command and full output are below:
```
jetson-containers run --env HUGGINGFACE_TOKEN=---------------------------------------- $(autotag nano_llm) python3 -m nano_llm.agents.web_chat --api=mlc --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper

Namespace(packages=['nano_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.3.0  JETPACK_VERSION=6.0  CUDA_VERSION=12.2
-- Finding compatible container image for ['nano_llm']
dustynv/nano_llm:r36.2.0
[sudo] password for nvidia16:
Sorry, try again.
[sudo] password for nvidia16:
localuser:root being added to access control list

`TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /data/models/huggingface/token
Login successful
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Fetching 13 files: 100%|██████████| 13/13 [00:00<00:00, 9119.58it/s]
Fetching 17 files: 100%|██████████| 17/17 [00:00<00:00, 5235.18it/s]
18:19:49 | INFO | loading /data/models/huggingface/models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/e1945c40cd546c78e41f1151f4db032b271faeaa with MLC
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
18:19:52 | INFO | device=cuda(0), name=Orin, compute=8.7, max_clocks=918000, multiprocessors=8, max_thread_dims=[1024, 1024, 64], api_version=12020, driver_version=None
18:19:52 | INFO | loading Meta-Llama-3-8B-Instruct from /data/models/mlc/dist/Meta-Llama-3-8B-Instruct-ctx8192/Meta-Llama-3-8B-Instruct-q4f16_ft/Meta-Llama-3-8B-Instruct-q4f16_ft-cuda.so
18:19:52 | WARNING | model library /data/models/mlc/dist/Meta-Llama-3-8B-Instruct-ctx8192/Meta-Llama-3-8B-Instruct-q4f16_ft/Meta-Llama-3-8B-Instruct-q4f16_ft-cuda.so was missing metadata

architectures            │ ['LlamaForCausalLM']
attention_bias           │ False
attention_dropout        │ 0.0
bos_token_id             │ 128000
eos_token_id             │ 128009
hidden_act               │ silu
hidden_size              │ 4096
initializer_range        │ 0.02
intermediate_size        │ 14336
max_position_embeddings  │ 8192
model_type               │ llama
num_attention_heads      │ 32
num_hidden_layers        │ 32
num_key_value_heads      │ 8
pretraining_tp           │ 1
rms_norm_eps             │ 1e-05
rope_scaling             │
rope_theta               │ 500000.0
tie_word_embeddings      │ False
torch_dtype              │ bfloat16
transformers_version     │ 4.40.0.dev0
use_cache                │ True
vocab_size               │ 128256
name                     │ Meta-Llama-3-8B-Instruct
api                      │ mlc
mm_projector_path        │ /data/models/huggingface/models--meta-llama--Meta-Llama-3-8B-Instruct/snaps
quant                    │ q4f16_ft
type                     │ llama
max_length               │ 8192
prefill_chunk_size       │ -1
load_time                │ 10.725436296997941
params_size              │ 3895.7578125

18:20:00 | INFO | using chat template 'llama-3' for model Meta-Llama-3-8B-Instruct
18:20:00 | INFO | model 'Meta-Llama-3-8B-Instruct', chat template 'llama-3' stop tokens: ['<|end_of_text|>', '<|eot_id|>'] -> [128001, 128009]
18:20:00 | INFO | Warming up LLM with query 'What is 2+2?'
18:20:02 | INFO | Warmup response: 'Easy peasy!\n\nThe answer to 2+2 is... 4!<|eot_id|>'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 327, in <module>
    agent = WebChat(**vars(args))
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 32, in __init__
    super().__init__(**kwargs)
  File "/opt/NanoLLM/nano_llm/agents/voice_chat.py", line 38, in __init__
    self.asr = AutoASR.from_pretrained(asr=asr, **kwargs)
  File "/opt/NanoLLM/nano_llm/plugins/speech/auto_asr.py", line 34, in from_pretrained
    from nano_llm.plugins.audio.riva_asr import RivaASR
ModuleNotFoundError: No module named 'nano_llm.plugins.audio.riva_asr'
```
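In case it helps anyone who hits this before an updated container ships: since the failure is just a stale import path (nano_llm.plugins.audio.riva_asr vs. nano_llm.plugins.speech.riva_asr), one possible stopgap is to patch that path inside the container before launching the agent. The sketch below is untested and assumes riva_asr.py really does live under plugins/speech in the r36.2.0 image, as described above; the edit won't persist after the container exits unless you commit the image or mount the patched file.

```bash
# Hypothetical stopgap (untested): open a shell in the container, fix the stale import
# path, then launch the agent from the same session.
jetson-containers run --env HUGGINGFACE_TOKEN=<your_token> $(autotag nano_llm) /bin/bash

# Inside the container: point auto_asr.py at plugins/speech instead of plugins/audio.
sed -i 's/nano_llm\.plugins\.audio\.riva_asr/nano_llm.plugins.speech.riva_asr/' \
    /opt/NanoLLM/nano_llm/plugins/speech/auto_asr.py

# Then launch the web chat agent as before.
python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper
```

Presumably the fix already in the main branch amounts to the same one-line change of that import in auto_asr.py.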