pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

Requesting Tutorial - Interpreting BERT Models (Part 2) #502

Closed xiaoda99 closed 3 years ago

xiaoda99 commented 4 years ago

📚 Documentation

Interpreting BERT Models (Part 1) (https://captum.ai/tutorials/Bert_SQUAD_Interpret) says at the end: "In the Part 2 of this tutorial we will to go deeper into attention layers, heads and compare the attributions with the attention weight matrices, study and discuss related statistics". That is exactly the topic I want to read about, but I can't find Part 2 anywhere on the web. If Part 2 hasn't been finished yet, when will it be available?

NarineK commented 4 years ago

Hi @xiaoda99, thank you for reminding us about it :) We'll find some time and add Part 2. What type of BERT tasks are you working on? If you find any papers or resources related to that, feel free to post them here :) Thank you!

xiaoda99 commented 4 years ago

I'm working on attributing model predictions to the attention weights of ALL self-attention layers, i.e. seeing which attention links matter most for the model's predictions. My preferred attribution method is integrated gradients. Two related papers I want to follow (with relevant section numbers):

[1] Hao, Y., Dong, L., Wei, F., & Xu, K. (2020). Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. arXiv preprint arXiv:2004.11207. (Section 4)

[2] Cui, L., Cheng, S., Wu, Y., & Zhang, Y. (2020). Does BERT Solve Commonsense Task via Commonsense Knowledge? arXiv preprint arXiv:2008.03945. (Sections 4.2-5.2)
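For what it's worth, the core idea in [1] (Section 4) can be sketched in a few lines of plain PyTorch, without BERT: treat the attention matrix A as the input being attributed, integrate the gradient of a scalar model output along the straight line from the zero matrix to A, and multiply element-wise by A. Everything below is a toy, hypothetical setup (random projections, a tanh readout, a single head) chosen only to illustrate the integration loop; it is not the actual Part 2 tutorial or Captum's API.

```python
import torch

torch.manual_seed(0)

seq_len, d = 4, 8
x = torch.randn(seq_len, d)

# Toy single-head self-attention with fixed random projections
# (stand-ins for one BERT head's learned Wq/Wk/Wv).
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
A = torch.softmax((x @ Wq) @ (x @ Wk).T / d ** 0.5, dim=-1)

def f(attn):
    # Scalar "model output" as a function of the attention matrix.
    # tanh makes it nonlinear so the integral is non-trivial.
    return torch.tanh(attn @ (x @ Wv)).mean()

# Integrated gradients along the path alpha * A, alpha in (0, 1],
# approximated with the midpoint rule.
steps = 300
total_grad = torch.zeros_like(A)
for k in range(steps):
    alpha = (k + 0.5) / steps
    a = (alpha * A).requires_grad_()   # leaf tensor at this path point
    f(a).backward()
    total_grad += a.grad

# Attr = A * (1/steps) * sum of grads  ~  A * integral of df/dA d(alpha)
attr = A * total_grad / steps          # one score per attention link
```

Each entry `attr[i, j]` scores the attention link from token i to token j. A quick sanity check is the completeness property of integrated gradients: `attr.sum()` should be close to `f(A) - f(0)` (and `f(0) == 0` here). Running this per head and per layer, then aggregating, is essentially what Section 4 of [1] proposes.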

NarineK commented 4 years ago

Thank you @xiaoda99, I'll look into them.

NarineK commented 3 years ago

Bert tutorial part 2 PR: https://github.com/pytorch/captum/pull/593

NarineK commented 3 years ago

Closing, addressed in PR: #593