jbloomAus / DecisionTransformerInterpretability

Interpreting how transformers simulate agents performing RL tasks
https://jbloomaus-decisiontransformerinterpretability-app-4edcnc.streamlit.app/
MIT License
61 stars, 15 forks

Added attention pattern and logit lens analyses for patched activations #98

Closed by JayBaileyCS 11 months ago

JayBaileyCS commented 11 months ago

Performed:

Tested: