SchwinnL / LLM_Embedding_Attack

Code to conduct an embedding attack on LLMs

Question about tokenizer and attention mask implementation #2

Closed by krishnakanthnakkav2 3 months ago

krishnakanthnakkav2 commented 3 months ago

Hi,

Thank you for releasing the code. I have questions regarding padding and attention mask implementation.

  1. How should the tokenizer's padding side ideally be set: to the left or to the right? At https://github.com/SchwinnL/LLM_Embedding_Attack/blob/95ff15f033a62ba6aa948533cd36f4234776bf37/unlearning_utils.py#L101 the code forces the padding side to left; is that intended for all models?

  2. In the get_attention_mask() calculation at https://github.com/SchwinnL/LLM_Embedding_Attack/blob/95ff15f033a62ba6aa948533cd36f4234776bf37/unlearning_utils.py#L137, the mask is set to True wherever the token id is not equal to 0. Shouldn't this check against tokenizer.pad_token_id instead of 0? Why is 0 used to build the attention mask? This has a critical impact on the loss function, and when I printed the attention mask for the first input, every index was set to True (see the sketch below).
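
For concreteness, here is a minimal sketch of the check I am suggesting; the signature is assumed for illustration and may not match the repo's actual get_attention_mask():

```python
import torch

def get_attention_mask(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # True for real tokens, False for padding; compares against the
    # tokenizer's pad_token_id instead of the hard-coded 0.
    return input_ids != pad_token_id

# With pad_token_id = 2, the padding positions are masked out, whereas
# `input_ids != 0` would mark every position as True here.
input_ids = torch.tensor([[2, 2, 15, 8, 31]])
get_attention_mask(input_ids, pad_token_id=2)
# tensor([[False, False,  True,  True,  True]])
```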

Thank you for answering.

krishnakanthnakkav2 commented 3 months ago

One more question:

At https://github.com/SchwinnL/LLM_Embedding_Attack/blob/95ff15f033a62ba6aa948533cd36f4234776bf37/unlearning_utils.py#L218 each column is tokenized separately. How is the padding applied there: to the left or to the right for the question column and the affirmative-response column?

SchwinnL commented 3 months ago

Hi,

We generally used 0 as the pad token (or set the pad token to token id 0) for the models we used. I will adapt the code to mask with the tokenizer's pad token after the NeurIPS rebuttal.

Since we need to generate in an autoregressive manner for the final evaluation, left-padding makes the most sense. For optimization, right-padding is fine as well.
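
To illustrate the padding-side point, a minimal sketch (the model name below is just a placeholder, not necessarily one from our experiments):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")  # placeholder model
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # common fallback when no pad token is defined

# Left padding keeps the last position of every sequence a real token,
# which is what autoregressive generation continues from.
tok.padding_side = "left"
batch = tok(["short prompt", "a somewhat longer prompt"], padding=True, return_tensors="pt")
# batch.input_ids[0]: [pad, pad, ..., real tokens]; generation then appends after the real tokens.
```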

Does this answer your questions?

krishnakanthnakkav2 commented 3 months ago

Thanks for the prompt response.

Since the code concatenates the prompt, attack, and target embeddings in the following order:

full_embeddings = torch.hstack([embeddings_input, embeddings_attack, embeddings_target])

I think that during training, embeddings_input has to be left-padded and embeddings_target has to be right-padded, so that all padding ends up at the end of the sequence and is ignored by the loss.

If embeddings_target is also left-padded, padding tokens end up between the attack embeddings and the target embeddings (sketched below).

I don't know if this is a strong concern, but I just wanted to know your thoughts.
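
To make the layout concrete, here is a small sketch using token ids instead of embeddings (the pad id and values are purely illustrative):

```python
import torch

PAD = 0  # illustrative pad id

# Left-padded prompt, fixed-size attack, right-padded target: padding sits only
# at the outer edges, so nothing interrupts the attack -> target transition.
prompt = torch.tensor([PAD, PAD, 11, 12, 13])
attack = torch.tensor([21, 22, 23])
target = torch.tensor([31, 32, PAD, PAD])
torch.hstack([prompt, attack, target])
# tensor([ 0,  0, 11, 12, 13, 21, 22, 23, 31, 32,  0,  0])

# If the target is left-padded as well, pad tokens land between the attack
# and the target, which is the concern described above.
target_left = torch.tensor([PAD, PAD, 31, 32])
torch.hstack([prompt, attack, target_left])
# tensor([ 0,  0, 11, 12, 13, 21, 22, 23,  0,  0, 31, 32])
```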

SchwinnL commented 3 months ago

The attack is always the same size. There is no need for padding.

In our experiments, the resulting generations were fine either way, but I agree that it makes sense to right-pad the targets. It might also improve the fluency of the generations, which are sometimes nonsense after the target sequence has been generated.

A good solution would be to concatenate the question and target, do the left padding, and later insert the suffix embedding attack in between (see the sketch below). Please don't hesitate to let me know if you have additional issues/questions.
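
Roughly, the idea looks like this; the helper below is only a sketch with assumed names and shapes, not code from the repo:

```python
import torch

def build_full_embeddings(question_emb: torch.Tensor,  # (q_len, dim)
                          target_emb: torch.Tensor,    # (t_len, dim)
                          attack_emb: torch.Tensor,    # (attack_len, dim), fixed size
                          pad_emb: torch.Tensor,       # (dim,) embedding of the pad token
                          max_len: int) -> torch.Tensor:
    # Left-pad the concatenated (question + target) pair to max_len, then splice
    # the attack suffix in between: [pads | question | attack | target].
    n_pad = max_len - (question_emb.shape[0] + target_emb.shape[0])
    pads = pad_emb.expand(n_pad, -1)
    return torch.vstack([pads, question_emb, attack_emb, target_emb])
```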