-
I ran DARE experiments with mergekit before, and I see that your paper also suggests experimenting with mergekit. Could you explain how to run it?
The example.yml in dare is written as follows. How do I c…
-
Hi, I got a training curve like this. Is it normal? Would you mind sharing your trainer_state.json? Thanks!
-
Dear Author, is it possible to provide the correct lit_model.pth file? Alternatively, a guide for generating the correct lit_model.pth file would be a great help.
Thank you.
-
`RuntimeError: FlashAttention backward for head dim > 64 requires A100 or H100 GPUs as the implementation needs a large amount of shared memory.`
Is it referring to the head dimension of vicuna-7b b…
-
![1720428897820](https://github.com/xinke-wang/ModaVerse/assets/173769345/b8fbb6d0-c647-43a7-b4db-ed70d34deac9)
This code is for vicuna-7b-delta-v1.1. Which code is for vicuna-7b-delta-v0?
and …
-
I am trying to calculate the acceptance rate for evaluation. I ran:
`python -m eagle.evaluation.gen_ea_alpha_vicuna --ea-model-path ~/models/EAGLE-Vicuna-7B-v1.3/ --base-model-path ~/models/vicuna-…
-
Thank you for your excellent work. 'magicr/vicuna-7b' seems to be your private repository. I would like to know whether it differs from other vicuna models. Thanks!
-
cd /workspace/Pai-Megatron-Patch/examples/llava/
sh run_pretrain_megatron_llava.sh \
dsw \
/workspace/Pai-Megatron-Patch \
7B \
4 \
32 \
1e-3 \
1e-4 \
2048 \
2048 \
0 \
bf16 \
…
-
Hello, this is great work! I am trying to download the model from Hugging Face.
# Load model directly
from transformers import AutoTokenizer
address = "./src/model/llaga-simteg-HO-classification"
confi…
-
### System Info
I am getting the following error, which should not occur:
`cannot import name 'ShardedDDPOption' from 'transformers.trainer'`
I have the following versions installed - …