Closed MarsMeng1994 closed 5 months ago
Hi! Thank you for this issue.
Please set --pretraining_length 4096 for Llama2:
python retrieve_needle.py --max_length 192k --pretraining_length 4096 --model llama --dca
The default value is 32k because of my latest experiments on Mistral. Sorry for the inconvenience.
Sorry, I forgot to mention it, but my --pretraining_length is already 4096.
Oh, I found the problem:
it uses replace_with_chunkmistral by default.
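For anyone hitting the same issue, the fix amounts to dispatching the monkey-patch on the --model flag instead of always falling back to the Mistral variant. Below is a minimal sketch of that idea; apart from replace_with_chunkmistral, which appears in this thread, the function and argument names are assumptions for illustration, not the repo's actual API.

```python
import argparse

# Hypothetical patch functions: in the real repo these would monkey-patch
# the model's attention with the DCA (Dual Chunk Attention) variant.
def replace_with_chunkllama(pretraining_length):
    return ("llama", pretraining_length)

def replace_with_chunkmistral(pretraining_length):
    return ("mistral", pretraining_length)

# Map the --model flag to the matching patch instead of hard-coding Mistral.
PATCHES = {
    "llama": replace_with_chunkllama,
    "mistral": replace_with_chunkmistral,
}

def apply_dca_patch(model: str, pretraining_length: int):
    """Apply the DCA patch that matches the requested model family."""
    return PATCHES[model](pretraining_length)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", choices=PATCHES, default="mistral")
    parser.add_argument("--pretraining_length", type=int, default=4096)
    args = parser.parse_args(["--model", "llama", "--pretraining_length", "4096"])
    print(apply_dca_patch(args.model, args.pretraining_length))
```

With this kind of dispatch, passing --model llama selects the Llama patch rather than silently applying the Mistral one.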
Hahaha I think so! Does it work now?
I will close this issue for now. Should you have any further questions or concerns, please don't hesitate to reopen it.
It stops working beyond 9k.
model: Llama-2-7b-hf
transformers: 4.37.2
torch: 2.0.1
flash-attn: 2.5.2
My command: python retrieve_needle.py --max_length 192k --model llama --dca