PhilipQuirke / verified_transformers

Tool used to verify the accuracy of transformer models
Apache License 2.0

Extend: Consider integrating ACDC #8

Open PhilipQuirke opened 5 months ago

PhilipQuirke commented 5 months ago

The Automated Circuit Discovery (ACDC) library (https://github.com/ArthurConmy/Automatic-Circuit-Discovery) contains code that analyses a transformer model, detects which nodes depend on which other nodes, and graphs the results. This "node dependency chain" information would help a researcher understand a transformer model. The filtering could be extended to cover "time-ordering" dependencies such as "D3.ST4 node depends on D2.ST3 node".
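As a rough illustration of the "time-ordering" filter idea, here is a minimal sketch in plain Python. It does not use the ACDC or Transformer Lens APIs. The `Dependency` class, the `time_ordered` helper, and the position numbers are all hypothetical, and the node labels are treated as opaque tags taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One edge in a node dependency chain: `child` depends on `parent`."""
    child: str
    parent: str
    child_pos: int   # hypothetical: token position at which the child node acts
    parent_pos: int  # hypothetical: token position at which the parent node acts


def time_ordered(deps):
    """Keep only edges whose parent acts strictly before its child."""
    return [d for d in deps if d.parent_pos < d.child_pos]


# Made-up edges for illustration only; the positions are invented.
deps = [
    Dependency("D3.ST4", "D2.ST3", child_pos=4, parent_pos=3),  # kept
    Dependency("D2.ST3", "D3.ST4", child_pos=3, parent_pos=4),  # parent is later: dropped
]

kept = time_ordered(deps)
for d in kept:
    print(f"{d.child} node depends on {d.parent} node")
```

In a real integration the edges would presumably come from ACDC's output rather than a hand-written list, and the position of each node would need to be derived from our own node tagging.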

This issue covers:

(I believe Arthur's code currently uses a locally modified version of Transformer Lens, so fully integrating his code base with ours would likely involve: 1) retrofitting his Transformer Lens changes back into the mainstream Transformer Lens library; 2) simplifying the ACDC library to use the (newly improved) mainstream Transformer Lens library; 3) importing his (shrunken) library into our library. We should bring this work into scope, as a separate ticket, only if Arthur is willing to help us in some way.)