TRAIS-Lab / dattri

`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.
https://trais-lab.github.io/dattri/
MIT License
Minor issue with precompute_data_ratio in Arnoldi implementation #136

charles-pyj closed 1 month ago

charles-pyj commented 1 month ago

In line 412 of `influence_function.py`, the comment `# Assuming that full_train_dataloader has only one batch` is followed by `iter_number = math.ceil(len(full_train_dataloader) * self.precompute_data_ratio)`. If `full_train_dataloader` has only one batch, its length is 1, so `iter_number` is always 1 and the full data is precomputed no matter what ratio is provided. Not sure if this is intended.

jiaqima commented 1 month ago

Thanks for catching it. That is an outdated comment that should be removed.

charles-pyj commented 1 month ago

Got it! If there are multiple batches, it makes sense.
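
For reference, the arithmetic discussed in this thread can be sketched as below. This is only an illustration: `precompute_iterations` is a hypothetical helper that mirrors the quoted expression and is not part of `dattri`, and `num_batches` stands in for `len(full_train_dataloader)`.

```python
import math


def precompute_iterations(num_batches: int, precompute_data_ratio: float) -> int:
    """Hypothetical helper mirroring the quoted expression (not dattri's API)."""
    return math.ceil(num_batches * precompute_data_ratio)


ratios = (0.1, 0.5, 1.0)

# One batch: every ratio in (0, 1] rounds up to 1, so the whole batch is used.
single = [precompute_iterations(1, r) for r in ratios]

# Ten batches: the ratio now changes how many batches are precomputed.
multi = [precompute_iterations(10, r) for r in ratios]

print(single)  # [1, 1, 1]
print(multi)   # [1, 5, 10]
```

With a single batch the ratio has no effect, which matches the original observation; with several batches the ratio selects a proportional number of them.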