stefan-apollo closed this 4 months ago
I notice load_interaction_rotations and load_mean_vectors_and_gram_matrices may be able to be the same function somehow, but that's too much for me right now.
It would be nice to assert that all the configs match
Todo: Test that the verification will not give warnings now when I run things properly
Edit: Especially when I run gradient flow
Existing issues with verify function:
Separate gram loader
Description
Allow a different dataset to be used for the gram matrix computation than for the RIB basis computation.
I also allow using a pre-tokenized dataset rather than an untokenized one, to skip the (kinda slow) tokenization.
Also added an option to store the computed gram matrix to a file. That code doesn't feel super great and it'll need to be merged with #333, but it's there!
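To make the idea concrete, here is a rough sketch of computing a gram matrix from its own stream of activations and caching it to a file. It is not the code in this PR: `accumulate_gram`, `save_gram`, `load_gram`, the JSON file format, and the normalisation by sample count are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def accumulate_gram(batches, dim):
    """Sum the outer product x x^T over every activation vector in `batches`.

    Assumption: the result is normalised by the number of samples seen.
    """
    gram = [[0.0] * dim for _ in range(dim)]
    n_samples = 0
    for batch in batches:
        for x in batch:
            for i in range(dim):
                for j in range(dim):
                    gram[i][j] += x[i] * x[j]
            n_samples += 1
    return [[value / n_samples for value in row] for row in gram]


def save_gram(gram, path):
    # Hypothetical cache format: plain JSON, so a later run can skip the computation.
    Path(path).write_text(json.dumps(gram))


def load_gram(path):
    return json.loads(Path(path).read_text())


# Toy "gram dataset": two batches of 2-dimensional activation vectors.
gram_batches = [[[1.0, 0.0], [0.0, 2.0]], [[1.0, 1.0]]]
gram = accumulate_gram(gram_batches, dim=2)

with tempfile.TemporaryDirectory() as tmp_dir:
    gram_path = Path(tmp_dir) / "gram.json"
    save_gram(gram, gram_path)
    reloaded = load_gram(gram_path)
```

Because the gram matrix only depends on its own batches, it can be computed on a larger sample than the basis computation and then reused from the file.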
Motivation and Context
We noticed that the gram (PCA) computation is a lot more sensitive to the number of samples, and also a lot cheaper, than the Cs computation.
How Has This Been Tested?
Did runs and made scaling plots. Added a test to make sure this config option runs.
Does this PR introduce a breaking change?
No. If gram_dataset is not given, it defaults to the same dataset as is used for the Cs.
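The fallback described above could look roughly like the sketch below. Only the name `gram_dataset` comes from this PR; `DatasetConfig`, `BuildConfig`, and `resolved_gram_dataset` are hypothetical stand-ins, not the real config classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetConfig:
    # Hypothetical stand-in for the real dataset config.
    name: str
    n_samples: int


@dataclass
class BuildConfig:
    # Hypothetical config: `dataset` is used for the Cs, `gram_dataset` is optional.
    dataset: DatasetConfig
    gram_dataset: Optional[DatasetConfig] = None

    def resolved_gram_dataset(self) -> DatasetConfig:
        # Fall back to the main dataset when no gram_dataset is given,
        # so existing configs keep their old behaviour.
        if self.gram_dataset is not None:
            return self.gram_dataset
        return self.dataset


# An existing config without gram_dataset keeps working unchanged.
old_style = BuildConfig(dataset=DatasetConfig("main", 100))

# A new config can point the gram computation at a larger dataset.
new_style = BuildConfig(
    dataset=DatasetConfig("main", 100),
    gram_dataset=DatasetConfig("gram", 10_000),
)
```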