Question on the fisher_matrix function

db434 / EWC

TensorFlow v2 + Keras implementation of Elastic Weight Consolidation

24 stars 4 forks source link

Question on the fisher_matrix function #1

Closed jimmykimmy68 closed 9 months ago

jimmykimmy68 commented 9 months ago

Hi, thanks for sharing a great implementation of EWC!

I have a question on the fisher information function (i.e., def fisher_matrix(model, dataset, samples):), as follows.

In the calculation of the fisher information, the output = model(data) would provide the softmax outputs. But then according to your implementation, the log-likelihood is simply calculated as the log of the softmax output. Doesn't the likelihood calculation require the true label though? I thought the likelihood p(y|x,\theta) calculation would require the true label information (i.e., 'labels' in the implementation).

Thank you

db434 commented 9 months ago

Hmm, good question. I confess that I haven't looked at this for a long time, and even when I did, I didn't look at the maths too deeply.

I think that the phrase "log-likelihood" in the code is wrong, but the method implemented is correct. We're interested in how sensitive each output is to changes in each weight, but we don't care what the actual training gradient would be.

I'm happy to be corrected though!

jimmykimmy68 commented 9 months ago

Hi, thanks for the reply.

I think the log-likelihood would require both the ground-truth label and the predicted probabilities (i.e., model(data)) for its calculation.

I will look into this and your implementation in more detail.

Thank you!

jimmykimmy68 commented 9 months ago

https://www.ii.pwr.edu.pl/~tomczak/PDF/[JMT]Fisher_inf.pdf

jimmykimmy68 commented 9 months ago

For any other researchers who might be interested in this issue, the calculation of the true Fisher requires the ground-truth labels. However, for computation efficiency, most of the existing implementations (including Daniel's repository) use the empirical Fisher, which disposes of the use of ground-truth labels.

Refer to; Chaudhry, Arslan, Puneet K. Dokania, Thalaiyasingam Ajanthan, and Philip HS Torr. "Riemannian walk for incremental learning: Understanding forgetting and intransigence." In Proceedings of the European Conference on Computer Vision (ECCV), pp. 532-547. 2018.