CASE-Lab-UMD / LLM-Drop

The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
Apache License 2.0

How to assess the importance of the module? #4

Open sunkun1997 opened 12 hours ago

sunkun1997 commented 12 hours ago

Do you provide an interface to assess the importance of the model? I can't find any information about this in the README. Should I calculate the cosine similarity of the input and output myself?

Shwai-He commented 4 hours ago

Hello, thank you for reaching out. There's no need to calculate the similarity manually. We have provided base files for block drop and layer drop, which will automatically handle similarity calculations and generate new model configuration files after the layers are dropped. If you have any further questions, feel free to let us know!
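
For readers who want to see what a similarity-based importance score could look like, here is a minimal sketch. It is not the repository's code: it assumes importance is defined as one minus the mean cosine similarity between the per-token hidden states entering and leaving a module, and the names `cosine_similarity` and `importance_score` are hypothetical. Plain Python lists stand in for tensors to keep the example self-contained.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as having no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def importance_score(inputs, outputs):
    """Hypothetical score: 1 - mean cosine similarity over tokens.

    `inputs` and `outputs` are lists of per-token hidden-state vectors
    captured before and after a module. A score near 0 means the module
    barely changes its input, which suggests it is a candidate for dropping.
    """
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and aligned")
    sims = [cosine_similarity(x, y) for x, y in zip(inputs, outputs)]
    return 1.0 - sum(sims) / len(sims)


if __name__ == "__main__":
    hidden_in = [[1.0, 0.0], [0.0, 1.0]]
    hidden_out = [[2.0, 0.0], [0.0, 3.0]]  # same directions, rescaled
    print(importance_score(hidden_in, hidden_out))
```

Under this assumed definition, a module whose output points in the same direction as its input scores close to 0, while one that rotates the hidden states substantially scores higher.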