Glaciohound / LM-Infinite

Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
https://arxiv.org/abs/2308.16137
MIT License

limited_distance_forward() got an unexpected keyword argument 'padding_mask' #3

Closed dittops closed 11 months ago

dittops commented 1 year ago

I'm trying to run the eval script.

PYTHONPATH=. deepspeed --include localhost:$CUDA_VISIBLE_DEVICES --master_port $MASTER_PORT scripts/eval_downstream_tasks.py \
    --deepspeed_config configs/zero3_efficient_config.json \
    --model meta-llama/Llama-2-7b-hf --tokenizer_path meta-llama/Llama-2-7b-hf \
    --use_lambda_attention --local_branch 4096 --global_branch 100 --limit_distance 4096 \
    --dataset passkey_retrieval --dataset_dir ${PASSKEY_DATA} --dataset_group ${MAX_LENGTH} \
    --max_generation_length 10 --evaluate_metrics \
    --log_dir $LOG_DIR/$TRIAL
Glaciohound commented 1 year ago

Hi, this is caused by recent changes to modeling_llama.py in Hugging Face Transformers. We have updated the code. For now it ignores padding_mask, since the argument is newly introduced, rarely used, and hard to support.

Feel free to pull the latest code. ^_^
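
For readers hitting the same TypeError, here is a minimal sketch of the general pattern described above: a replacement forward function that accepts padding_mask and ignores it. This is not the repository's actual fix. The function names, signatures, and the DummyAttention class are hypothetical stand-ins, and no Transformers code is involved.

```python
# Hypothetical stand-ins: names and signatures are assumptions,
# not LM-Infinite's or Transformers' actual code.


def old_forward(self, hidden_states, attention_mask=None):
    """Replacement forward written against the older call convention."""
    return hidden_states


def patched_forward(self, hidden_states, attention_mask=None,
                    padding_mask=None, **kwargs):
    """Same logic, but tolerant of newer keyword arguments.

    padding_mask is accepted and deliberately ignored.
    """
    return hidden_states


class DummyAttention:
    """Placeholder for an attention module."""


def call_like_newer_caller(forward):
    # Mimics a caller that started passing padding_mask as a keyword.
    return forward(DummyAttention(), [1.0, 2.0],
                   attention_mask=None, padding_mask=None)


# The old signature rejects the new keyword with a TypeError.
try:
    call_like_newer_caller(old_forward)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The patched signature accepts the keyword and runs normally.
result = call_like_newer_caller(patched_forward)
```

Accepting **kwargs as well keeps the function from breaking again if the caller adds further keyword arguments, at the cost of silently dropping them.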