-
### OS
Linux
### GPU Library
CUDA 12.x
### Python version
3.10
### PyTorch version
2.4.1+cu121
### Model
Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24
### Describe the bug
```
File "/mn…
```
-
This comes up in NeMo / NeVA:
https://github.com/NVIDIA/NeMo/blob/32503fd946cedc41152152837c01f95ae4bc6dc6/nemo/collections/nlp/modules/common/megatron/attention.py#L973-L973
cc @tfogal
-
## Summary
https://github.com/reitzig/sdkman-for-fish/blob/555203d56e534d91cde87ad600cbbf6f2d112a03/README.md?plain=1#L28-L40
despite the description in https://github.com/reitzig/sdkman-for-fis…
-
### Motivation
The best multilingual open source small LM.
https://mistral.ai/news/mistral-nemo/
### Related resources
_No response_
### Additional context
_No response_
-
**Describe the bug**
I'm trying to follow the [SFT Tutorial](https://docs.nvidia.com/nemo-framework/user-guide/latest/modelalignment/sft.html#prerequisite), on a Llama-3-8b LLM and the process fail…
-
I downloaded `train-00000-of-00001.parquet` and `validation-00000-of-00001.parquet` from Google Cloud and converted them to `.npy` files using `scripts/convert_dataset.py`.
However, when using them …
-
### Distribution
Debian unstable
### Package version
6.2.8-1
### Frequency
Always
### Bug description
Hey.
(I think none of the other segfault issues already reported are the same one as t…
-
**Describe the bug**
I was trying to run SFT based on the Mixtral-8x7B-instruct model with tensor parallel size=4 (sequence parallel=True) and LoRA (target modules=[all]).
It reports that the output …
-
While preparing the benchmark for eager and dynamo using the code from the fork https://github.com/tfogal/NeMo, I get errors in the dynamo case.
## 🐛 Bug
After fixing [1187](https://github.com/Ligh…
-
### 📦 Environment
Docker
### 📌 Version
v1.31.5
### 💻 Operating System
macOS
### 🌐 Browser
Chrome
### 🐛 Bug Description
When creating a new assistant in LobeChat and setting its avatar, name, …