Hi @Daniel030117, thanks for your interest in our work.
> Do you have the code for calculating memorization scores?
Yes. The memorization score, which is defined as self-influence, is computed in compute_if.py.
> In the 'compute_if_attr' within CIFAR, where is the path specified in the filename variable?
Do you mean the pre-computed memorization attributions? You can find them in the score_42 folder.
Also, if you are interested in estimating memorization (self-influence) scores, I suggest using a newer method called TRAK rather than the original Influence Functions.
You may want to take a look at the LLM-TRAK repo.
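For readers unfamiliar with the term, here is a minimal sketch of what a self-influence score looks like on a toy model. It is not the code in compute_if.py and not TRAK: it assumes a small L2-regularized logistic regression where the Hessian can be inverted exactly, and scores each training example i as g_i^T H^{-1} g_i, with g_i its loss gradient and H the Hessian of the training objective. The function names, the toy data, and the sign convention are assumptions for illustration only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logreg(X, y, l2=0.1, lr=0.5, steps=2000):
    """Fit L2-regularized logistic regression with plain gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + l2 * w
        w -= lr * grad
    return w


def self_influence(X, y, w, l2=0.1):
    """Return g_i^T H^{-1} g_i for every training example i."""
    n, d = X.shape
    p = sigmoid(X @ w)
    # Per-example gradients of the log loss, one row per example.
    G = (p - y)[:, None] * X
    # Hessian of the regularized mean loss (positive definite because l2 > 0).
    H = (X.T * (p * (1.0 - p))) @ X / n + l2 * np.eye(d)
    # Solve H V = G^T instead of forming the inverse explicitly.
    V = np.linalg.solve(H, G.T)
    return np.einsum("nd,dn->n", G, V)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X @ rng.normal(size=5) > 0).astype(float)
    y[:5] = 1.0 - y[:5]  # flip a few labels to create atypical examples
    w = fit_logreg(X, y)
    scores = self_influence(X, y, w)
    print("highest self-influence indices:", np.argsort(-scores)[:10])
```

For deep networks the exact Hessian solve above is not feasible, which is why approximate methods such as Influence Functions with iterative solvers, or TRAK, are used instead.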
The filename path here is in saved/random/0/42/checkpoint, but I couldn't find the checkpoint folder.
We did not share the model checkpoints because they are too large.
We only shared the pre-computed memorization attributions.
You can obtain the checkpoints by running the training yourself, as shown in run_random_0.sh.
First create the log directory:
mkdir -p log/random/0
Then run the training:
CUDA_VISIBLE_DEVICES=0 python -u train.py --SEED=42 --SAVE_CHECKPOINT=True > log/random/0/log_seed_42.txt
:)
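If you prefer to launch the run from Python, the two shell steps above could be wrapped roughly as follows. This is only a sketch: train_command and run_training are hypothetical helpers, not part of the repo, and they simply reproduce the flags and log path quoted above. It assumes you run it from the directory that contains train.py.

```python
import os
import subprocess


def train_command(seed, gpu="0", run_dir="log/random/0"):
    """Build the argv, extra environment, and log path for one training run."""
    argv = ["python", "-u", "train.py", f"--SEED={seed}", "--SAVE_CHECKPOINT=True"]
    env = {"CUDA_VISIBLE_DEVICES": gpu}
    log_path = f"{run_dir}/log_seed_{seed}.txt"
    return argv, env, log_path


def run_training(seed, gpu="0", run_dir="log/random/0"):
    """Create the log directory, then run train.py with stdout sent to the log."""
    argv, env, log_path = train_command(seed, gpu, run_dir)
    os.makedirs(run_dir, exist_ok=True)  # same effect as mkdir -p
    with open(log_path, "w") as log:
        subprocess.run(argv, env={**os.environ, **env}, stdout=log, check=True)


if __name__ == "__main__":
    run_training(42)
```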
Please first follow the README.md ...
Download the CIFAR-10, SNLI, SST, and Yahoo! Answers datasets from the web, then process them using 00_EDA.ipynb.
Thank you.
Do you have code for calculating memorization scores? Also, in the 'compute_if_attr' within CIFAR, where is the path specified in the filename variable?