-
### Describe the bug
I'm trying to train a LLaMA model with all linear layers + embeddings and head.
Whilst embeddings have no problems with FSDP over Liger, there are always exceptions when [ lm_head…
-
Did you ever run forward on the LM with just motion-to-motion or text-to-text, or is this (below) stage 1 of training as described in the paper?
From the paper:
"To generalize to various downstre…
-
I'm attempting to train LLaMA-3 using Megatron-LM but have encountered an issue: LLaMA-3 utilizes Tiktoken for tokenization and doesn't provide a tokenizer.model file, which is required by Megatron-LM…
-
I am having an [issue](https://github.com/evilsocket/cake/issues/36) with deploying Qwen2.5-Coder with the WIP program [Cake](https://github.com/evilsocket/cake).
The following error happens when u…
-
# I want to evaluate the precision of the GGUF model using llama.cpp as the inference framework
## Use these commands:
./llama-server -m /root/ICAS_test/models/Qwen-1_8B-Q8_0.gguf
lm_eval --model gguf …
-
File "/home/lm/OpenFedLLM-main/main_dpo.py", line 109, in <module>
results = trainer.train()
File "/home/lm/yes/envs/opfl/lib/python3.10/site-packages/transformers/trainer.py", line 1539, in train
…
-
## Description
As suggested by @alvarobartt, it would be nice to integrate [`mlx-lm`](https://pypi.org/project/mlx-lm/)
-
Running in Colab, loading the model from Drive with:
spirit_lm = Spiritlm("/content/drive/MyDrive/data/checkpoints/spiritlm_model/spirit-lm-expressive-7b")
then running the generation step from standa…
-
Hi,
I have a question about the prev() function in the LinkeeQueue.hpp file.
``` c++
template
bool LM_LinkedList::prev() {
if (length == 0)
return false;
if (curr->prev != nullpt…
```
-
Hi, I got an error when I ran the lm-eval command:
`Traceback (most recent call last):
File "/vol3/ctr/.conda/envs/hzx1/bin/lm_eval", line 8, in <module>
sys.exit(cli_evaluate())
File "/vol3/ctr/l…