Closed stefan-apollo closed 4 months ago
Graphs look very similar (bug?) and the ablation curves look sus compared to the old ones
The code is definitely running in the gradient-flow fashion, always computing [current_node_layer, final_node_layer], so no nearest-neighbour computation can be going on. I'd rule out simple bugs.
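To make the distinction concrete, here is a minimal sketch of the layer pairs each mode would pass to a build step. The function names and the layer names in the usage note are hypothetical and are not taken from the rib codebase; only the `[current_node_layer, final_node_layer]` pattern comes from the comment above.

```python
def nearest_neighbour_pairs(node_layers):
    """Pair each node layer with the one directly after it."""
    return [[a, b] for a, b in zip(node_layers[:-1], node_layers[1:])]


def gradient_flow_pairs(node_layers):
    """Pair each node layer with the final node layer (naive gradient flow)."""
    final_node_layer = node_layers[-1]
    return [[layer, final_node_layer] for layer in node_layers[:-1]]
```

With hypothetical layers `["ln1.0", "mlp_in.0", "unembed", "output"]`, the first function pairs adjacent layers, while the second always ends at `"output"`. If the second pattern is what the run logs show, a nearest-neighbour computation is not happening.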
Hmm ablation curves look sus as well, and unlike those in #141
Still happens even if I use the old (1-alpha)^2 basis
Lambdas do seem to shift differently between normal and NGF though
Cosine similarity of bases:
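For reference, a comparison of this kind could be computed roughly as below. This is a generic sketch in plain Python and not the code used for the plots; representing each basis as a list of direction vectors is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pairwise_cosine(basis_a, basis_b):
    """Matrix of cosine similarities between every pair of basis directions."""
    return [[cosine_similarity(u, v) for v in basis_b] for u in basis_a]
```

If the sign of a basis direction is arbitrary, taking the absolute value of each entry before comparing may be more informative.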
Run with lots of neighbouring node layers:
I've cleaned up the code a bit and removed all the lies. Changes since review:
- `check_outfile_overwrite` logic to `_get_out_file_path`, and use the new function in NGF
- `config.naive_gradient_flow = True` even if `config.interaction_matrices_path` is not None
- `results.config` from `results.config`; set it from `config` (requires fewer changes)
- `results.config = replace_pydantic_model(config, {"calculate_edges": False})`
- if `config.calculate_edges`, run `rib_build` without save file and then add edges to existing results and save manually
Description: Naive implementation of the gradient flow method, based on looping rib_build a few times. Slows the code down by a factor of approximately num_node_layers/2.
Tested: Made the comparison plots below. They look much closer to NNIB than expected (from the #141 plots), but we can't see anything broken. Also implemented tests.
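One way to arrive at a figure like num_node_layers/2 is to count how many layer-to-layer segments each mode has to span. The sketch below assumes cost is proportional to that count and that each naive run goes from one node layer to the final one; both are simplifying assumptions, not measurements, and the function names are hypothetical.

```python
def segments_nearest_neighbour(num_node_layers):
    """Each node layer connects only to the next one: n - 1 segments."""
    return num_node_layers - 1


def segments_naive_gradient_flow(num_node_layers):
    """Run i spans every segment from node layer i to the final node layer."""
    n = num_node_layers
    return sum(n - 1 - i for i in range(n - 1))


def estimated_slowdown(num_node_layers):
    """Ratio of segments spanned; requires at least two node layers."""
    if num_node_layers < 2:
        raise ValueError("need at least two node layers")
    return segments_naive_gradient_flow(num_node_layers) / segments_nearest_neighbour(num_node_layers)
```

Under these assumptions the ratio is exactly `num_node_layers / 2`, since the naive count is `n * (n - 1) / 2` against `n - 1` for nearest neighbour.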
No breaking changes. Doesn't work on MLP because our MLP implementation requires the node_layers to be a strict sub-sequence of model layers, which cannot be done in naive gradient flow. Added validation for this.
Mod add example:
With naive gradient flow:
With nearest neighbour:
![image](https://github.com/ApolloResearch/rib/assets/148209923/6089217f-822d-4e86-9ce0-bc28bad6136b)