HKUNLP / ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Apache License 2.0

Perplexity validation on PG19 error and Passkey Retrieval error #19

Open khfs opened 1 week ago

khfs commented 1 week ago

I followed the environment setup in the README exactly. When running the perplexity validation on PG19, the only differences from the original code are that I loaded the model from a local path and set the device to 'cpu' to see the exact error messages. My command line was:

python test_ppl.py --seq_len 16384 --scale 7b --data_path pg19_llama2.validation.bin

The terminal output was:

The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.88it/s]
Test PPL on seq length 16384
  0%|          | 0/9446 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "test_ppl.py", line 102, in <module>
    evaluate_ppl_all(seq_length=args.seq_len, sliding_window=256, args=args, model=model, data=data)
  File "test_ppl.py", line 58, in evaluate_ppl_all
    outputs = model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1183, in forward
    outputs = self.model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1027, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 163, in forward
    return F.embedding(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/torch/nn/functional.py", line 2264, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
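
For reference, this IndexError from F.embedding means some input token id is not a valid row of the embedding matrix, i.e. it is greater than or equal to the model's vocabulary size. A minimal sketch that reproduces the same failure (Llama-2's vocabulary is 32,000 ids):

import torch
import torch.nn as nn

# Llama-2's embedding table has 32,000 rows (its vocabulary size).
embed_tokens = nn.Embedding(num_embeddings=32000, embedding_dim=16)

# Any token id at or above 32,000 raises the same error as above:
# "IndexError: index out of range in self"
bad_input_ids = torch.tensor([[70000]])
embed_tokens(bad_input_ids)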

When performing Passkey Retrieval, the only difference from the original code is that I loaded the model from a local path. My command line was:

python test_passkey.py --seq_len 16384 --scale 7b

The terminal output was:

Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 11.13it/s]
Traceback (most recent call last):
  File "test_passkey.py", line 123, in <module>
    main(args)
  File "test_passkey.py", line 77, in main
    model = load_checkpoint_and_dispatch(model, checkpoint=model_path,
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/accelerate/big_modeling.py", line 607, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/data4/xylu/miniconda3/envs/chunkllama/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1705, in load_checkpoint_in_model
    raise ValueError(
ValueError: /data3/xylu/checkpoints/NousResearch/Llama-2-7b-hf containing more than one .index.json file, delete the irrelevant ones.
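
The ValueError comes from accelerate finding more than one weight-map index in the local checkpoint folder (typically model.safetensors.index.json alongside pytorch_model.bin.index.json). A small sketch to list them so the irrelevant one can be removed, using the path from the traceback:

import glob
import os

# Path copied from the ValueError above.
model_path = "/data3/xylu/checkpoints/NousResearch/Llama-2-7b-hf"

# accelerate expects exactly one *.index.json weight map per checkpoint
# folder; local clones often hold both a safetensors index and a
# pytorch_model.bin index. List them, then delete the one whose format
# you are not loading.
for index_file in glob.glob(os.path.join(model_path, "*.index.json")):
    print(index_file)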

ChenxinAn-fdu commented 6 days ago

Hi! Have you verified your code without the ChunkLlama monkey patch?

khfs commented 6 days ago

Yes, I commented out line 85 in test_ppl.py and got the same error message. Later, after comparing this code with LongLoRA's eval.py, I found that changing np.uint32 to np.uint16 on line 98 of test_ppl.py allowed the code to run. However, the result of running CHUNKLLAMA2 7B was {"seq_len": 16384, "gpu": "1", "data_path": "pg19_llama2.validation.bin", "scale": "7b", "pretraining_length": 4096, "ppl": 1803.4413082318101}. This result is obviously wrong, and I don't know what other issues remain in the code.
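
To double-check the dtype theory, one can memmap the same file both ways and compare the largest "token id" against Llama-2's 32,000-entry vocabulary. A minimal sketch:

import numpy as np

# Llama-2 token ids are all below 32,000, so they fit in uint16 and the
# PG19 file was evidently written with that dtype. Reading the same
# bytes as uint32 fuses two adjacent uint16 tokens into one huge value,
# which then indexes past the embedding table.
data_path = "pg19_llama2.validation.bin"

as_uint16 = np.memmap(data_path, dtype=np.uint16, mode="r")
as_uint32 = np.memmap(data_path, dtype=np.uint32, mode="r")

print("max token id as uint16:", as_uint16.max())  # expected: < 32000
print("max token id as uint32:", as_uint32.max())  # far beyond the vocab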

ChenxinAn-fdu commented 5 days ago

Thank you for letting me know! I think this issue was caused by mistakenly uploading files tokenized with the Llama3 tokenizer. I will check right now.

ChenxinAn-fdu commented 5 days ago

Hi! Changing

data = {'val': np.memmap(data_path, dtype=np.uint32, mode='r')}

to

data = {'val': np.memmap(data_path, dtype=np.uint16, mode='r')}

works for me. Remember not to comment out this line:

replace_with_chunkllama(args.pretraining_length, args.pretraining_length//4)
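
Putting the two points together, a minimal sketch of the intended setup (argument names as used in this thread; treat the exact import path of the patch module as an assumption):

import numpy as np
from transformers import AutoModelForCausalLM
from chunkllama_attn_replace import replace_with_chunkllama  # assumed module path

pretraining_length = 4096  # Llama-2's original context window

# Apply the ChunkLlama monkey patch; do NOT comment this line out,
# otherwise the model runs with vanilla attention beyond 4k context.
replace_with_chunkllama(pretraining_length, pretraining_length // 4)

model = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")

# PG19 token ids were written as uint16, so read them back as uint16.
data = {"val": np.memmap("pg19_llama2.validation.bin", dtype=np.uint16, mode="r")}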

khfs commented 5 days ago

Although this allows the code to run, the result I obtained is {"seq_len": 16384, "gpu": "1", "data_path": "pg19_llama2.validation.bin", "scale": "7b", "pretraining_length": 4096, "ppl": 1803.4413082318101}, where the PPL is too high. Therefore, I believe there is still an issue with the code. I am curious about your results.
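
For anyone reproducing the numbers: with the dtype fixed and the patch applied, the evaluation itself is a standard sliding-window perplexity loop (score only the last sliding_window tokens of each seq_len window). A minimal sketch, with sliding_window_ppl as a hypothetical stand-in for the repo's evaluate_ppl_all:

import math

import numpy as np
import torch


@torch.no_grad()
def sliding_window_ppl(model, tokens, seq_len=16384, window=256, device="cuda"):
    """Strided perplexity: slide a seq_len context forward by `window`
    tokens and only score the newest `window` positions of each step."""
    total_nll, n_scored = 0.0, 0
    for begin in range(0, len(tokens) - seq_len, window):
        ids = torch.from_numpy(
            tokens[begin:begin + seq_len].astype(np.int64)
        ).unsqueeze(0).to(device)
        labels = ids.clone()
        labels[:, :-window] = -100  # mask context tokens out of the loss
        loss = model(input_ids=ids, labels=labels).loss
        total_nll += loss.item() * window
        n_scored += window
    return math.exp(total_nll / n_scored)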

ChenxinAn-fdu commented 5 days ago

I have updated the code. Please try the newest version 🤣

khfs commented 5 days ago

Thank you for your response regarding the PG19 validation; I am currently testing the latest version of the code. How can I resolve the passkey retrieval error?

ChenxinAn-fdu commented 5 days ago

test_passkey.py has also been updated!