frederick0329 / TracIn

Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
Apache License 2.0

Q: Applicability to sequence tagging #11

Open thangld201 opened 6 months ago

thangld201 commented 6 months ago

Hi @frederick0329, for sequence tagging (e.g. NER), the model has to predict a label for every token in each test sequence. In this case, the loss is averaged across tokens, and the gradients of the last FFN layer can still be computed. I have three questions:

  1. Do you think the TracIn computations would require any significant change in this case?
  2. Would approximate nearest neighbor search be usable in this case?
  3. Would the fast random approximation (Appendix F) be usable as well?
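
To make the setup concrete, here is a minimal NumPy sketch of what I have in mind. It is not taken from this repository. It assumes a linear last layer without bias followed by softmax cross-entropy, and the names `last_layer_grad` and `tracin_score` are hypothetical. Each sequence is given as a pair of per-token hidden states `(T, d)` and integer labels `(T,)`. The score sums, over checkpoints, the learning rate times the dot product of the two last-layer gradients.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def last_layer_grad(hidden, labels, weights):
    """Gradient of the token-averaged cross-entropy w.r.t. the last-layer weights.

    hidden:  (T, d) per-token inputs to the last layer (assumed given)
    labels:  (T,)   integer tag ids
    weights: (d, C) last-layer weights (bias omitted in this sketch)
    Returns an array of shape (d, C).
    """
    probs = softmax(hidden @ weights)              # (T, C)
    err = probs.copy()
    err[np.arange(len(labels)), labels] -= 1.0     # p_t - onehot(y_t)
    return hidden.T @ err / len(labels)            # average over tokens

def tracin_score(train, test, checkpoints, lrs):
    """Sum over checkpoints of lr * <grad(train), grad(test)>, last layer only."""
    score = 0.0
    for w, lr in zip(checkpoints, lrs):
        g_train = last_layer_grad(train[0], train[1], w)
        g_test = last_layer_grad(test[0], test[1], w)
        score += lr * float(np.sum(g_train * g_test))
    return score
```

Under these assumptions, each sequence yields a gradient of fixed size `d * C` regardless of its length `T`, so it can be flattened into a single vector per example.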