prakamya-mishra opened this issue 3 weeks ago
Hi @FranxYao, if I use your code for evaluating the Llama-3.2-3B model, specifically:

```python
scaling_factor = 10  # hardcode
reset_rope(self.model_to_test, model_max_train_len=81920, scaling_factor=scaling_factor)
```
It throws the following error:

```
AttributeError: 'LlamaRotaryEmbedding' object has no attribute '_set_cos_sin_cache'
```
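For context on the error itself: recent transformers releases refactored `LlamaRotaryEmbedding` to compute cos/sin dynamically in `forward()` and dropped the private `_set_cos_sin_cache` method, which is presumably what `reset_rope` patches. A quick probe (hypothetical, not part of the repo) to confirm which side of that refactor the installed version is on:

```python
# Hypothetical probe: check whether the installed transformers release still
# exposes the private method that reset_rope appears to rely on.
from transformers.models.llama.modeling_llama import LlamaRotaryEmbedding

if hasattr(LlamaRotaryEmbedding, "_set_cos_sin_cache"):
    print("old-style cached RoPE tables: reset_rope should work")
else:
    # Newer releases compute cos/sin on the fly, so the cached-table API
    # that reset_rope patches no longer exists.
    print("no _set_cos_sin_cache: skip reset_rope or pin an older transformers")
```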
So if I comment this part out, then I get the following results:
This is unexpected, as the Llama-3.2-3B model claims to support a context length of up to 128K. Do you also get this error, or how do you handle this?
What is the correct way to evaluate the Llama-3.2-3B model?
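Possibly related: since Llama-3.2 advertises 128K natively, its shipped config may already carry the long-context RoPE parameters, in which case the hardcoded `reset_rope` call might be unnecessary. A minimal sanity check (assuming the local download path used in the snippet below):

```python
# Minimal sketch: inspect the shipped config rather than patching RoPE.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained('<path>/Llama-3.2-3B')
print(cfg.max_position_embeddings)  # expected 131072 if the 128K claim holds
print(cfg.rope_scaling)             # Llama-3.x models ship llama3-style rope scaling here
```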
I downloaded the Llama model using:

```python
from huggingface_hub import snapshot_download

snapshot_download(repo_id='meta-llama/Llama-3.2-3B', local_dir='<path>/Llama-3.2-3B',
                  repo_type='model', local_dir_use_symlinks=False)
```
And the run command is:

```bash
(
python -u needle_in_haystack.py --s_len 0 --e_len 128000 \
    --model_provider LLaMA \
    --model_path <path>/Llama-3.2-3B
) 2>&1 | tee logs/Llama-3_2-3B.log
```