jbloomAus / DecisionTransformerInterpretability

Interpreting how transformers simulate agents performing RL tasks
https://jbloomaus-decisiontransformerinterpretability-app-4edcnc.streamlit.app/
MIT License

"Algebraic value editing" raises exception #82

Open alexander-turner opened 1 year ago

alexander-turner commented 1 year ago

On default settings, I selected the AVE analysis, and got the following:

AssertionError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).
Traceback:

File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "/app/decisiontransformerinterpretability/app.py", line 193, in <module>
    show_algebraic_value_editing(dt, logit_dir=logit_dir, original_cache=cache)
File "/app/decisiontransformerinterpretability/src/streamlit_app/causal_analysis_components.py", line 780, in show_algebraic_value_editing
    corrupted_tokens = get_corrupted_tokens(dt, key="avec")
File "/app/decisiontransformerinterpretability/src/streamlit_app/causal_analysis_components.py", line 410, in get_corrupted_tokens
    corrupted_tokens = get_modified_tokens_from_app_state(
File "/app/decisiontransformerinterpretability/src/streamlit_app/environment.py", line 147, in get_modified_tokens_from_app_state
    assert not torch.all(

jbloomAus commented 1 year ago

Ahh, this is the issue where I default to a value that's useful half the time and breaks the other half: the corrupted instruction defaults to key, but if the instruction is already a key, then the corrupted and clean tokens are the same and the assertion fails. You can't see the error message because Streamlit redacts it. I'll update the default to something else.
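For anyone reading along, here is a minimal sketch of the kind of fix described above. It is not the code in this repo: `pick_corrupt_default`, `check_tokens_differ`, and the `"key"` / `"ball"` option names are hypothetical and only illustrate picking a default that always differs from the clean value, plus raising a readable error when the corruption changes nothing.

```python
from typing import Sequence


def pick_corrupt_default(clean_value: str, options: Sequence[str]) -> str:
    """Return the first option that differs from the clean value.

    Hypothetical helper: the real app may choose its default differently.
    """
    for option in options:
        if option != clean_value:
            return option
    raise ValueError(f"no option differs from the clean value {clean_value!r}")


def check_tokens_differ(clean: Sequence[int], corrupted: Sequence[int]) -> None:
    """Raise a descriptive error when corrupted and clean tokens are identical.

    Hypothetical stand-in for a bare assert, so the user sees what to change.
    """
    if list(clean) == list(corrupted):
        raise ValueError(
            "corrupted tokens are identical to the clean tokens; "
            "change at least one corrupted input"
        )


# Illustrative usage with assumed option names.
default = pick_corrupt_default("key", ["key", "ball"])
check_tokens_differ([0, 1, 2], [0, 3, 2])
```

With a default chosen this way, the clean and corrupted inputs should never coincide on first load, whichever value the clean instruction happens to have.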

Haven't iterated on the AVE stuff recently (it still only works at the residual stream level). I was hoping to compare it to more well-established techniques. I have some thoughts on this, but it's ongoing.