What does "accumulated" mean in your issue? As you can see in the code, we calculate the Euclidean distance for each layer separately, and then select the layer with the maximum Euclidean distance.
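A minimal sketch of what that selection amounts to (simplified for illustration; the `j * 2` / `j * 2 + 1` pairing of the safe and unsafe hidden states follows the snippet quoted below, while the helper name and the loop structure are assumptions, not the repository's exact code):

```python
import torch

# Simplified sketch: score every transformer layer independently and keep the one
# whose paired safe/unsafe hidden states are farthest apart. Nothing is summed
# across layers, i.e., the distances are not accumulated.
def find_toxic_layer(hidden_states, j):
    max_distance, toxic_layer = -1.0, None
    for layer_index in range(1, len(hidden_states)):
        euclidean_distance = torch.dist(
            hidden_states[layer_index][j * 2],
            hidden_states[layer_index][j * 2 + 1],
            p=2,
        )
        if euclidean_distance > max_distance:
            max_distance, toxic_layer = euclidean_distance, layer_index
    return toxic_layer, max_distance
```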
Hi, have you solved your issue yet?
> What does "accumulated" mean in your issue? As you can see in the code, we calculate the Euclidean distance for each layer separately, and then select the layer with the maximum Euclidean distance.
By "accumulated" I mean that the difference between consecutive layers may matter more than the absolute value. For instance:

- Layer 1 Euclidean distance: 0.1
- Layer 2 Euclidean distance: 0.8
- Layer 3 Euclidean distance: 0.81
- Layer 4 Euclidean distance: 0.7

With the method proposed in the paper, we would choose Layer 3 as the toxic layer. However, the difference between Layer 3 and Layer 2 is only 0.01, while the difference between Layer 2 and Layer 1 is 0.7, which is much larger. So I consider the weights between Layer 1 and Layer 2 to be more important than those between Layer 2 and Layer 3. What do you think about this?
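Concretely, the alternative I have in mind would look something like the sketch below (purely hypothetical; `find_toxic_layer_by_delta` and the `layer_distances` dict are illustrations, not code from the repository):

```python
# Hypothetical alternative: pick the layer where the safe/unsafe distance grows the most
# compared with the previous layer, instead of the layer with the largest absolute distance.
def find_toxic_layer_by_delta(layer_distances):
    # layer_distances maps layer index -> per-layer Euclidean distance.
    layers = sorted(layer_distances)
    best_delta, toxic_layer = float("-inf"), None
    for prev, curr in zip(layers, layers[1:]):
        delta = layer_distances[curr] - layer_distances[prev]
        if delta > best_delta:
            best_delta, toxic_layer = delta, curr
    return toxic_layer, best_delta

# Toy numbers from above: selects Layer 2 (delta of roughly 0.7) rather than Layer 3.
print(find_toxic_layer_by_delta({1: 0.1, 2: 0.8, 3: 0.81, 4: 0.7}))
```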
We use the top-1 ranked layer in our paper, i.e., Layer 3 in your toy example.
We acknowledge that the top-1 strategy is quite simple and may not be optimal. Our choice of this strategy was based on the following consideration: although information accumulates from the bottom layers, the greatest differentiation occurs at the top-1 layer. Directly operating on that layer is akin to adding a guardrail.
Your idea is great, and it might yield even better results. Just a friendly reminder: if you're considering multiple layers, please carefully select appropriate hyperparameters and update strategies. The more changes made to the original model, the more likely it is to introduce unintended side effects.
By the way, this paper (ReFT: Representation Finetuning for Language Models) examines the integration of different layers, which might serve as a valuable reference for your work.
Looking forward to your future contributions to this field.
Thank you for your response. It answered my question.
In the paper, I saw that it says: "we consider the toxic layer to be the transformer layer that most effectively separates the distributions of safe and unsafe sequences."
In the code, I saw:

```python
for layer_index in range(1, len(hidden_states)):
    # Per-layer distance between the two paired (safe/unsafe) hidden states.
    euclidean_distance = torch.dist(
        hidden_states[layer_index][j * 2],
        hidden_states[layer_index][j * 2 + 1],
        p=2)
```
I have a question here: is the `euclidean_distance` accumulated across layers or not?