-
When will Megatron-DeepSpeed support llama3/llama3.1 pretraining?
-
There should be an option to upload an image.
-
Hi,
First of all, thanks for your work.
I tried out the llama3.2-vision:90b model on ollama and it seems to underperform the version available on the build.nvidia.com API, with the same prompt …
-
## Description
- Unsupported data format during lowering from TTForge to TTIR: Bfp2_b. The unsupported data format assertion is raised from lower_to_mlir.cpp.
`RuntimeError: TT_ASSERT @ /proj_sw/user_dev/mramanath…
-
### System Info
- `transformers` version: 4.45.1
- Platform: Linux-5.4.247-162.350.amzn2.x86_64-x86_64-with-glibc2.26
- Python version: 3.10.12
- Huggingface_hub version: 0.24.0
- Safetensors v…
-
LongViLa-LLama3-1024Frames output is often repetitive. Why does this happen, and are there any suggestions to reduce the repetition?
-
**Is your feature request related to a problem? Please describe.**
I was taking a look at Karpathy's llama3.c single file and found something similar in Java:
https://github.com/mukel/llama3.j…
-
To get this to work, first you have to get an external AMD GPU working on Pi OS. The most up-to-date instructions are currently on my website: [Get an AMD Radeon 6000/7000-series GPU running on Pi 5](…