UFO-101 / auto-circuit

A library for efficient patching and automatic circuit discovery.
https://UFO-101.github.io/auto-circuit

Input wise masks for mask gradients #4

Closed: oliveradk closed this 4 months ago

oliveradk commented 4 months ago

Adds support for computing input-wise mask gradients. This is useful, for example, for anomaly detection using edge attribution scores.
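
To illustrate the distinction, here is a toy sketch in plain Python. It is not auto-circuit's API: the function names and the linear interpolation form of the mask are assumptions made for illustration. With one mask value shared across the batch, the gradient of a summed loss is aggregated over all inputs; with one mask entry per input, each input gets its own score.

```python
# Toy model (an assumption, not the library's implementation): one edge whose
# output interpolates between a clean and a corrupted activation,
#   out_i = (1 - m) * clean_i + m * corrupt_i
# The loss is the sum of out_i over the batch, so d(out_i)/dm = corrupt_i - clean_i.

def shared_mask_grad(clean, corrupt):
    """Gradient w.r.t. a single mask value shared by every input (summed over the batch)."""
    return sum(c - x for x, c in zip(clean, corrupt))


def input_wise_mask_grads(clean, corrupt):
    """Gradient w.r.t. a per-input mask entry: one score per input."""
    return [c - x for x, c in zip(clean, corrupt)]


clean = [1.0, 2.0, 3.0]
corrupt = [1.5, 2.0, 7.0]

total = shared_mask_grad(clean, corrupt)           # one number for the whole batch
per_input = input_wise_mask_grads(clean, corrupt)  # one number per input
```

In this toy batch the third input has a much larger score than the others, which the aggregated value alone would not reveal. That per-input view is the kind of signal an anomaly detector could use.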

oliveradk commented 4 months ago

Thanks for the feedback! I implemented most of your suggestions. However, instead of adding a dimension to patch_mask by default, I only add the batch dimension if set_mask_batch_size is called. This minimizes the chance of introducing downstream bugs, and I don't think any of the edge indexing functionality is critical if the main use case is to collect attribution scores over batches (please let me know if I'm missing something crucial there).
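
The opt-in design described above can be sketched roughly as follows. This is a hypothetical stand-in written with plain lists rather than tensors: only the names patch_mask and set_mask_batch_size come from this thread, while the class, its other methods, and the shapes are assumptions.

```python
class ToyMaskedEdge:
    """Hypothetical stand-in for a module that holds a patch mask."""

    def __init__(self, n_edges):
        # Default layout: one mask value per edge, shape (n_edges,).
        self.patch_mask = [0.0] * n_edges
        self.batched = False

    def set_mask_batch_size(self, batch_size):
        # Opt in to a batch dimension: one copy of the mask per input,
        # shape (batch_size, n_edges). Code that never calls this method
        # keeps seeing the default unbatched layout.
        base = self.patch_mask[0] if self.batched else self.patch_mask
        self.patch_mask = [list(base) for _ in range(batch_size)]
        self.batched = True

    def mask_shape(self):
        if self.batched:
            return (len(self.patch_mask), len(self.patch_mask[0]))
        return (len(self.patch_mask),)


edge = ToyMaskedEdge(n_edges=4)
default_shape = edge.mask_shape()   # unbatched unless the caller opts in
edge.set_mask_batch_size(8)
batched_shape = edge.mask_shape()   # batch dimension added on request
```

Keeping the unbatched layout as the default means existing callers that index the mask by edge are unaffected unless they explicitly request the batch dimension.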

UFO-101 commented 4 months ago

I will just do a couple of checks locally and then merge in the next hour or so.

UFO-101 commented 4 months ago

@oliveradk Could you please fix the merge conflicts and address the final comment? Then I will merge.

oliveradk commented 4 months ago

Fixed the merge conflicts and addressed the comment (I had been running the test incorrectly and had to make some more tweaks).