HKUNLP / ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Apache License 2.0

needle test with llama2 #11

Closed MarsMeng1994 closed 5 months ago

MarsMeng1994 commented 5 months ago

The needle test starts failing at around 9k context length.

model: Llama-2-7b-hf
transformers: 4.37.2
torch: 2.0.1
flash-attn: 2.5.2

My command: python retrieve_needle.py --max_length 192k --model llama --dca

ChenxinAn-fdu commented 5 months ago

Hi! Thank you for raising this issue. Please set --pretraining_length 4096 for Llama-2:

python retrieve_needle.py --max_length 192k --pretraining_length 4096 --model llama --dca

The default is 32k because of my latest experiments on Mistral. Sorry for the inconvenience.
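For intuition, here is a minimal sketch of why --pretraining_length matters for DCA. This is an illustration of the intra-chunk position remapping only, not the repo's actual code, and the helper name below is made up for the example; the real implementation also handles inter-chunk and successive-chunk attention.

```python
# Simplified illustration only: DCA restarts position ids inside each chunk so
# that every position the model sees stays within the range it was pretrained on.

def intra_chunk_position_ids(seq_len: int, pretraining_length: int) -> list[int]:
    # Position ids wrap inside each chunk, so no id exceeds pretraining_length - 1.
    return [i % pretraining_length for i in range(seq_len)]

# Llama-2 was pretrained with a 4096-token window.
print(max(intra_chunk_position_ids(9_000, 4096)))    # 4095 -> in-distribution
# With the 32k default (meant for Mistral), position ids run far past 4095,
# which matches the failure you see starting around 9k in the needle test.
print(max(intra_chunk_position_ids(9_000, 32_768)))  # 8999 -> out-of-range for Llama-2
```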

MarsMeng1994 commented 5 months ago

Sorry, I forgot to mention that my --pretraining_length is already set to 4096.

MarsMeng1994 commented 5 months ago

Oh, I found the problem: by default the script uses replace_with_chunkmistral.

ChenxinAn-fdu commented 5 months ago

Hahaha I think so! Does it work now?
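If the script ever picks the Mistral patch again, a quick workaround is to apply the Llama patch explicitly before loading the model. A rough sketch, assuming a replace_with_chunkllama entry point in a chunkllama_attn_replace module (check retrieve_needle.py in your checkout for the exact import path):

```python
# Sketch of the workaround; the module/function names are assumed from this
# repo's attention-replacement code and may differ slightly in your checkout.
from chunkllama_attn_replace import replace_with_chunkllama  # assumed import

from transformers import AutoModelForCausalLM, AutoTokenizer

# Patch Llama attention with DCA *before* the model is instantiated,
# using the base model's pretraining window (4096 for Llama-2).
replace_with_chunkllama(pretraining_length=4096)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    attn_implementation="flash_attention_2",  # flash-attn 2.x per your report
    torch_dtype="auto",
    device_map="auto",
)
```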

ChenxinAn-fdu commented 5 months ago

I will close this issue for now. Should you have any further questions or concerns, please don't hesitate to reopen it.