Write a check to look at layer weight norms at initialization on the architecture, maybe visualize in a bar chart.

jbloomAus / DecisionTransformerInterpretability

Interpreting how transformers simulate agents performing RL tasks

https://jbloomaus-decisiontransformerinterpretability-app-4edcnc.streamlit.app/

MIT License


Write a check to look at layer weight norms at initialization on the architecture, maybe visualize in a bar chart. #63

Status: Closed (jbloomAus closed this issue 1 year ago)

jbloomAus commented 1 year ago

I'm nervous that I haven't been disciplined enough with the initialization of the weights in my models. I should research how people think about initialization and what to do about it.
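
A minimal sketch of what such a check could look like, assuming the models are PyTorch `nn.Module`s. The helper names `weight_norms` and `plot_weight_norms` and the toy `nn.Sequential` model are hypothetical, not code from this repository. It collects the L2 norm of every named parameter right after construction and optionally draws them as a horizontal bar chart with matplotlib.

```python
import math
from typing import Dict

import torch
from torch import nn


def weight_norms(model: nn.Module) -> Dict[str, float]:
    """Return the L2 (Frobenius) norm of every named parameter in the model."""
    return {
        name: param.detach().norm().item()
        for name, param in model.named_parameters()
    }


def plot_weight_norms(norms: Dict[str, float], path: str = "weight_norms.png") -> None:
    """Save a horizontal bar chart of parameter norms (hypothetical helper)."""
    # Imported lazily so the norm check itself does not require matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 0.3 * len(norms) + 1))
    ax.barh(list(norms.keys()), list(norms.values()))
    ax.set_xlabel("L2 norm at initialization")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-in for the real architecture (an assumption for illustration).
    model = nn.Sequential(nn.Linear(16, 32), nn.LayerNorm(32), nn.Linear(32, 4))
    norms = weight_norms(model)
    for name, value in norms.items():
        print(f"{name:12s} {value:.4f}")
    plot_weight_norms(norms)
```

Raw norms grow with parameter size, so dividing each norm by `math.sqrt(param.numel())` (an RMS value) may make layers of different shapes easier to compare. Parameters that stand out by an order of magnitude from their neighbours are the ones worth checking against the intended initialization scheme.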