-
### Describe the issue
I was running this notebook from Autogen's RAG example:
```python
def termination_msg(x):
    return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()

boss…
```
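For reference, the predicate above returns True only for dict messages whose content ends with the word "TERMINATE" (case-insensitively: the literal is nine characters long, so `[-9:]` grabs exactly the last nine). A minimal standalone check of that behavior:

```python
def termination_msg(x):
    # True only for dict messages whose content ends with "TERMINATE"
    # (case-insensitive; "TERMINATE" is nine characters long).
    return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()

print(termination_msg({"content": "All done. TERMINATE"}))  # True
print(termination_msg({"content": "keep going"}))           # False
print(termination_msg("TERMINATE"))                         # False: not a dict
```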
-
**Describe the bug**
We followed [Accelerated-RLHF.md](https://github.com/NVIDIA/NeMo-Aligner/blob/v0.3.0.trtllm/Accelerated-RLHF.md) to accelerate PPO training with TensorRT-LLM. A…
-
Hey unsloth team, beautiful work being done here.
I am the author of [MachinaScript for Robots](https://github.com/babycommando/machinascript-for-robots) - a framework for building LLM-powered robo…
-
Hi,
First of all, thank you for your excellent work on this project. I tried to run your code on my machine, but I encountered an error that seems to indicate an issue with loading the model from a n…
-
### What happened?
LLaMA 3 was trained with an 8192-token context. When using a single slot with the llama.cpp HTTP server, that slot is assigned the full 8192 context. However, when using multiple slots and n…
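If I understand the slot behavior correctly, the server splits the total context (`-c`) evenly across the parallel slots (`-np`), so each slot only sees a fraction of the trained context. Illustrative arithmetic (a sketch of the reported behavior, not llama.cpp's actual code):

```python
# Sketch of how the llama.cpp server appears to divide the context
# window across parallel slots; numbers are illustrative.
def per_slot_context(n_ctx: int, n_parallel: int) -> int:
    # Each slot gets an even share of the total context.
    return n_ctx // n_parallel

print(per_slot_context(8192, 1))  # 8192: a single slot gets the full context
print(per_slot_context(8192, 4))  # 2048: four slots each get a quarter
```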
-
@Yuliang-Liu Nice work! I ran into a finetuning issue as follows:
![e20823216a167f50ed234a1468f8a51](https://github.com/user-attachments/assets/48f6316a-0098-4d9c-a34f-ec32c60a48d1)
2 GPUs of NVIDIA A8…
-
Hi,
Since it is common to use DeepSpeed ZeRO with offloading when training large LLMs, does TE currently support this mode?
Currently, DeepSpeed support is just a unit test, as referred to in TE's r…
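For context, this is the kind of setup I mean: a minimal sketch of a DeepSpeed ZeRO stage 3 configuration with CPU offloading (field names follow DeepSpeed's config schema; the values are illustrative placeholders, not a tested recipe):

```python
# Illustrative DeepSpeed config: ZeRO stage 3 with optimizer and
# parameter offloading to CPU. Values are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}
```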
-
### Title
NER on patent data
### Team Name
Octane
### Email
202318005@daiict.ac.in
### Team Member 1 Name
Aditi singh
### Team Member 1 Id
202318005
### Team Member 2 Name
Hani Soni
### Te…
-
Hello DLRover team:
I'm a member of the LF AI&Data TAC and voted for this project to make it to the sandbox as it's very interesting, especially regarding the optimizations you implement.
I'm al…
-
Hey @acon96, it seems to me that this is not a bug but an incorrect configuration. I tried to run the `train.py` script with some parameters on my 1 x RTX 4090 graphics card, training the `llama-3.1-8B` model…
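For a sense of why configuration matters so much here, a back-of-the-envelope VRAM estimate for fully fine-tuning an 8B-parameter model in bf16 with Adam (rough numbers, ignoring activations and overhead):

```python
# Rough memory estimate for full fine-tuning of an 8B model; not exact,
# just illustrating why a single 24 GB card needs offloading or LoRA.
params = 8e9
weights_gb = params * 2 / 1e9  # bf16 weights: 2 bytes per parameter
grads_gb = params * 2 / 1e9    # bf16 gradients
optim_gb = params * 8 / 1e9    # Adam moments in fp32: 2 x 4 bytes
total_gb = weights_gb + grads_gb + optim_gb
print(round(total_gb))  # ~96 GB, far beyond a 24 GB RTX 4090
```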