abertsch72 / unlimiformer

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
MIT License

Error Encountered While Running 'run_generation.py' Script #48

Open arqumk opened 11 months ago

arqumk commented 11 months ago

I am facing an issue when attempting to run the 'run_generation.py' script from the 'unlimiformer/src' directory. This script is crucial for my work with Unlimiformer on Google Colab, and I need assistance in resolving the problem.

When executing the following command:

python src/run_generation.py --model_type llama --model_name_or_path meta-llama/Llama-2-13b-chat-hf \
    --prefix "<s>[INST] <<SYS>>\n You are a helpful assistant. Answer with detailed responses according to the entire instruction or question. \n<</SYS>>\n\n Summarize the following book: " \
    --prompt example_inputs/harry_potter_full.txt \
    --suffix " [/INST]" --test_unlimiformer --fp16 --length 200 --layer_begin 16 \
    --index_devices 1 --datastore_device 1
The script starts on Google Colab but terminates abruptly with the following output:
    10/05/2023 11:16:12 - WARNING - __main__ - device: cuda, n_gpu: 1, 16-bits training: True
    Using pad_token, but it is not set yet.
    ^C

The script uses the 'cuda' device, and the log reports a single GPU ('n_gpu: 1'). The '--fp16' flag enables 16-bit (half-precision) mode, and the command also sets '--length', '--layer_begin', '--index_devices', and '--datastore_device'. The run stops right after the pad_token warning with only '^C' printed, so the output gives no clear indication of the root cause.

bunnyfu commented 9 months ago

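I think you should try --index_devices 0 --datastore_device 0 in your command. Your log reports n_gpu: 1, so the only GPU is device 0, and device 1 most likely does not exist on that Colab instance.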
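
The device indices can be checked against the number of visible GPUs before launching the script. The following is a minimal sketch, not part of the Unlimiformer repo: it assumes PyTorch is installed, and the helper names `invalid_device_indices` and `visible_gpu_count` are hypothetical, introduced only for illustration.

```python
def invalid_device_indices(requested, n_gpu):
    """Return the requested CUDA device indices that do not exist
    on a machine with n_gpu visible GPUs (valid range: 0 .. n_gpu - 1)."""
    return [i for i in requested if not 0 <= i < n_gpu]


def visible_gpu_count():
    """Number of CUDA devices PyTorch can see; 0 if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    n_gpu = visible_gpu_count()
    # Values passed as --index_devices and --datastore_device in the original command.
    requested = [1, 1]
    bad = invalid_device_indices(requested, n_gpu)
    if bad:
        print(f"Visible GPUs: {n_gpu}; device indices {bad} do not exist.")
    else:
        print(f"Visible GPUs: {n_gpu}; all requested device indices are valid.")
```

With a single GPU, only index 0 passes this check, which is consistent with the suggestion to use --index_devices 0 --datastore_device 0. Whether the invalid index is the actual cause of the termination is not confirmed in this thread.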