hila-chefer / Transformer-Explainability

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
MIT License

Different Bert Models #37

Closed Christoforos00 closed 2 years ago

Christoforos00 commented 2 years ago

Hi, thanks for your paper and repo.

I noticed that in bert_pipeline.py you use a different Bert version for each explainability method. https://github.com/hila-chefer/Transformer-Explainability/blob/71d6844afe10b0bf7e70a04d0bb55c7ef4127fac/BERT_rationale_benchmark/models/pipeline/bert_pipeline.py#L422

At that line, two different classes are used: BertForSequenceClassificationTest and BertForClsOrigLrp. Why is this done?

Thank you.

hila-chefer commented 2 years ago

Hi @Christoforos00, thanks for your interest!

The different implementations contain different versions of the code: with or without LRP, and with our modified LRP rules. For baselines that do not use LRP, there is no point in adding the relprop function to the model layers. To compare against baselines that use the original LRP rules without our customization, we keep a separate implementation whose relprop function follows the original rules.
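To make the distinction concrete, here is a minimal sketch of the pattern the answer describes: layers in the LRP-enabled model classes carry an extra `relprop` method that propagates relevance backwards, while the plain baseline classes use ordinary layers without it. The class name `LinearWithRelprop` and the epsilon-rule shown are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn

class LinearWithRelprop(nn.Linear):
    """Hypothetical sketch of a layer that supports relevance propagation.

    A baseline model without LRP would just use nn.Linear; an LRP-enabled
    model swaps in layers like this one so relevance can flow backwards.
    """

    def forward(self, x):
        self.X = x  # cache the input, needed later by relprop
        return super().forward(x)

    def relprop(self, R, eps=1e-6):
        # Epsilon-LRP rule (illustrative): redistribute the relevance R of
        # the output onto the input, proportionally to each input's
        # contribution to the pre-activation z = Wx + b.
        Z = super().forward(self.X) + eps   # stabilized output
        S = R / Z                           # relevance per unit of output
        C = torch.matmul(S, self.weight)    # backpropagate through W
        return self.X * C                   # element-wise contribution
```

The point of keeping separate model classes (rather than one class with a flag) is that each explainability baseline then runs against exactly the propagation rules it was designed for, without untested code paths.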

I hope this clarifies things a bit.