-
### System Info
llama3 released
https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6
https://github.com/meta-llama/llama3
### Who can help?
@ncomly-nvidia
### …
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and f…
-
Arraymancer has become a key piece of the Nim ecosystem. Unfortunately, I do not have the time to develop it further, for several reasons:
- family: the birth of a family member and the death of hobby time.
- competin…
-
Dear all,
I'm struggling to get the sample code working on my laptop with an Nvidia A2000 (8GB) card.
Does anyone have any advice?
RuntimeError: Expected all tensors to be on the same device, but …
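This error usually means the model's parameters and the input tensors live on different devices (e.g. the model on CPU, inputs on CUDA). A minimal sketch of the usual fix, assuming a generic PyTorch model (the `Linear` layer here is only a stand-in for the actual sample code):

```python
import torch

# Pick one device and use it consistently throughout the script.
# On the A2000 this resolves to "cuda"; falls back to CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # move the model's parameters
x = torch.randn(1, 4).to(device)          # move the inputs to the SAME device

y = model(x)  # no device-mismatch RuntimeError: everything lives on `device`
print(y.device)
```

The key point is that `.to(device)` must be applied to both the model and every input tensor; a single tensor left on the other device reproduces the error.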
-
### Search before asking
- [X] I have searched the YOLOv5 [issues](https://github.com/ultralytics/yolov5/issues) and [discussions](https://github.com/ultralytics/yolov5/discussions) and found no simi…
-
# Prerequisites
Hi there,
I am fine-tuning the model `https://huggingface.co/jphme/em_german_7b_v01` using my own data (I have replaced the questions and answers with dots to keep it short and simple). …
-
If I understood correctly, we should convert Llama3 with "convert-hf-to-gguf.py". This uses a ton of memory, and my Mac Studio M1 Ultra with 128GB VRAM is unable to convert Llama3-70b to f32. Luckily it worked …
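Some back-of-envelope arithmetic (a sketch with approximate figures, ignoring conversion overhead) shows why f32 output does not fit in 128GB for a 70B-parameter model:

```python
# Memory needed just to hold the weights at each precision
# (approximate parameter count; overhead and activations excluded).
params = 70e9

f32_gb = params * 4 / 1e9  # 4 bytes per float32 parameter
f16_gb = params * 2 / 1e9  # 2 bytes per float16 parameter

print(f"f32: {f32_gb:.0f} GB, f16: {f16_gb:.0f} GB")
# ~280 GB at f32 is far beyond 128 GB; ~140 GB at f16 is much closer.
```

This is why passing a lower-precision output type to the conversion script (the script accepts an `--outtype` option, e.g. `f16`) halves the footprint relative to f32.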
-
### 🐛 Describe the bug
I'm trying to use NNCF to quantize a recommender-system model to int8. Before using it on our production model, I wanted to get it working on a simple toy example first, but am seei…
-
### 🚀 The feature, motivation and pitch
# Summary
We would like to support the 4-bit KV cache for the decoding phase. The purpose of this feature is to reduce the GPU memory usage of the KV cache wh…
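A quick illustration of the savings at stake (hypothetical Llama-7B-like dimensions, purely for scale; the actual model config may differ):

```python
# Illustrative decoder configuration (assumed, not from the proposal).
layers, kv_heads, head_dim = 32, 32, 128
seq_len, batch = 4096, 1

def kv_cache_bytes(bits_per_elem: int) -> float:
    # Two tensors (K and V) per layer, each [batch, kv_heads, seq_len, head_dim].
    elems = 2 * layers * batch * kv_heads * seq_len * head_dim
    return elems * bits_per_elem / 8

fp16_gb = kv_cache_bytes(16) / 1e9
int4_gb = kv_cache_bytes(4) / 1e9
print(f"fp16 KV cache: {fp16_gb:.2f} GB, 4-bit: {int4_gb:.2f} GB")
```

A 4-bit cache is a straight 4x reduction over fp16, which either frees GPU memory or lets the same memory hold 4x the context/batch during decoding (modulo the small overhead of quantization scales).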
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar feature requests.
### Description
1.58 bit quantization i…
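For context, the "1.58 bit" figure comes from ternary weights {-1, 0, +1}: log2(3) ≈ 1.585 bits of information per weight. A minimal packing sketch (a hypothetical scheme for illustration, not the one any particular library uses) shows how five ternary weights fit in a single byte, since 3**5 = 243 ≤ 256:

```python
import math

print(math.log2(3))  # ≈ 1.585 bits per ternary weight

def pack5(ws):
    """Pack exactly 5 ternary weights (-1/0/+1) into one byte, base-3."""
    b = 0
    for w in ws:
        b = b * 3 + (w + 1)  # map -1/0/+1 -> 0/1/2
    return b

def unpack5(b):
    """Invert pack5: recover the 5 ternary weights from one byte."""
    ws = []
    for _ in range(5):
        ws.append(b % 3 - 1)
        b //= 3
    return ws[::-1]

weights = [-1, 0, 1, 1, -1]
assert unpack5(pack5(weights)) == weights
```

That is 8/5 = 1.6 bits per weight in practice, close to the 1.58-bit information-theoretic floor, versus 16 bits for fp16, roughly a 10x reduction in weight storage.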