Closed: byerose closed this issue 6 months ago
I tried a small batch size of 16. Same results.
Thanks for checking out the code! I actually could not reproduce this error.
Does the problem still persist? If so, please let me know if you have tried some debugging or got some more info, e.g., printing `optim_slice.stop - optim_slice.start`, `len(self.tokenizer)`, etc. The gradient shape `[20, 32000]` looks correct.
Given that I could not reproduce it, I'm wondering if it has to do with the `transformers` version. If you have not already, please make sure that the version is `transformers==4.35.2`.
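For anyone following along, the debugging values suggested above could be gathered in one place with something like the sketch below. This is not code from the repository: `describe_shapes` is a hypothetical helper, the slice bounds `slice(5, 25)` are made up for illustration, and the tokenizer length 32001 is the value reported later in this thread. Only the gradient shape `[20, 32000]` comes from the original report.

```python
def describe_shapes(grad_shape, optim_slice, tokenizer_len):
    """Hypothetical helper: collect the suggested debug values into one dict."""
    # Number of token positions being optimized, as suggested in the comment above.
    num_optim_tokens = optim_slice.stop - optim_slice.start
    grad_rows, grad_vocab = grad_shape
    return {
        "num_optim_tokens": num_optim_tokens,
        "grad_rows": grad_rows,
        "grad_vocab": grad_vocab,
        "tokenizer_len": tokenizer_len,
        # Simple comparisons to eyeball; they do not by themselves prove a bug.
        "rows_match_slice": grad_rows == num_optim_tokens,
        "vocab_matches_tokenizer": grad_vocab == tokenizer_len,
    }


# Illustrative values: slice bounds are assumed, shape and length are from the thread.
report = describe_shapes((20, 32000), slice(5, 25), 32001)
for name, value in report.items():
    print(f"{name}: {value}")
```

In the actual code, the three arguments would presumably be filled from the gradient tensor's shape, `optim_slice`, and `len(self.tokenizer)`. A difference between the gradient's vocabulary dimension and the tokenizer length is worth noting in a report, though this thread does not confirm whether it is the cause of the error.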
Could you please update the `requirements.txt` file? The current version uses `transformers==4.34.1`.
len of tokenizer: 32001
Updated! Thank you for catching that. I will close the issue for now assuming that versioning was the issue. Please feel free to reopen if the problem remains.
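If you want to confirm which `transformers` version is actually installed before reopening, a minimal check along these lines may help. It is only a sketch: `check_pin` is a hypothetical helper, not part of the repository, and the expected version `4.35.2` is the one mentioned earlier in this thread. It relies on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, expected):
    """Hypothetical helper: return (installed_version_or_None, matches_expected)."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        installed = None
    return installed, installed == expected


# The pin below is the version recommended in this thread.
installed, ok = check_pin("transformers", "4.35.2")
print(f"transformers installed: {installed}, matches 4.35.2: {ok}")
```

Running this inside the same environment that launches the experiment avoids the case where `requirements.txt` has been updated but an older install is still being picked up.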
example_run_gcg.sh
The record is as follows: