optim_str tokenization issue

Hi authors, thanks for the great work!

I have a question about the tokenization in this repo. It appears that before_ids, target_ids, after_ids, and optim_str_ids are each produced by tokenizing their segment separately. However, when optim_str is inserted back into the original messages and the full prompt is tokenized again, the token IDs covering the optim_str segment may differ from those obtained by tokenizing optim_str on its own, without the preceding context, because tokens can merge across segment boundaries.
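To make the concern concrete, here is a minimal sketch of the effect. It uses a toy greedy longest-match tokenizer with a made-up vocabulary, not the repo's code or a real model tokenizer, so `VOCAB`, `tokenize`, and `splits_consistently` are all hypothetical names for illustration only.

```python
# Toy vocabulary (hypothetical). Real tokenizers use learned merges, but the
# boundary effect shown here is the same in spirit.
VOCAB = ["hi", "!!", "!", " x", " ", "x", "h", "i"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text):
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids = []
    pos = 0
    while pos < len(text):
        # Pick the longest vocabulary entry that matches at this position.
        match = max(
            (tok for tok in VOCAB if text.startswith(tok, pos)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token matches at position {pos}")
        ids.append(TOKEN_TO_ID[match])
        pos += len(match)
    return ids


def splits_consistently(before_str, optim_str):
    """True if tokenizing the joined text equals joining the separate tokenizations."""
    joint = tokenize(before_str + optim_str)
    separate = tokenize(before_str) + tokenize(optim_str)
    return joint == separate


before_str = "hi!"
optim_str = "! x"

# Segments tokenized on their own, then concatenated.
separate_ids = tokenize(before_str) + tokenize(optim_str)
# The full text tokenized in one pass: "!" + "!" merges into "!!".
joint_ids = tokenize(before_str + optim_str)

print("separate:", separate_ids)
print("joint:   ", joint_ids)
print("consistent:", splits_consistently(before_str, optim_str))
```

In this toy case the trailing "!" of the prefix and the leading "!" of optim_str merge into a single "!!" token once the text is joined, so the IDs optimized in isolation no longer match what the model would see. A check along the lines of `splits_consistently` is one possible way to detect such candidates; whether the repo already does something equivalent is part of my question.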

Will this be fixed in a later version?

Thanks!

GraySwanAI/nanoGCG, issue #24