Bing-a-ling7 opened 1 month ago
It's probably due to the version of transformers. The released code works with `transformers==4.34.1`. In that version, `.generate` doesn't output `past_key_values`, so we modified it in `utils/cached_models.py` to support outputting `past_key_values`.
In the latest version of transformers, `.generate` does support outputting `past_key_values`, so there might be some conflicts with `utils/cached_models.py`.
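For reference, here is a minimal sketch (not the repo's code; the model name and generation flags are only illustrative, and it assumes a transformers release new enough for `.generate` to attach the cache to its output) of getting `past_key_values` from the official `.generate`:

```python
# Illustrative sketch, assuming a recent transformers release where .generate
# returns an output object carrying past_key_values when return_dict_in_generate=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # small example model from this issue
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("1 + 1 =", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=16,
    use_cache=True,
    return_dict_in_generate=True,  # return an output object instead of a bare tensor
)
# Newer releases attach the KV cache to the output object; 4.34.1 does not,
# which is why the repo patches .generate in utils/cached_models.py.
print(hasattr(out, "past_key_values") and out.past_key_values is not None)
```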
You can try (1) removing `utils/cached_models.py` and using the `past_key_values` returned by the official code of the latest transformers (as in the sketch above), or (2) downgrading transformers to 4.34.1; a small version check for option (2) is sketched below.
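If you go with option (2), a quick guard like the following (purely illustrative, not part of the repo) makes a version mismatch fail loudly instead of surfacing as a confusing error inside generation:

```python
# Illustrative version guard: fail fast if the installed transformers release
# is not the one the patched generate in utils/cached_models.py targets.
import transformers

if transformers.__version__ != "4.34.1":
    raise RuntimeError(
        f"Found transformers=={transformers.__version__}; "
        "run `pip install transformers==4.34.1`, or remove utils/cached_models.py "
        "and rely on the official .generate output instead."
    )
```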
When I run `bash scripts/gsm8k/generate.sh`, I use `set_trace` to debug the `_sample_tokens_with_calculator` function, and an error occurs when executing the following line:

(Due to insufficient GPU memory, I used `TinyLlama/TinyLlama-1.1B-Chat-v1.0` as a replacement model.)

However, when I comment out `past_key_values=past_key_values`, the error disappears. What is causing that error?