Closed: diff7 closed this issue 3 years ago
No. We did try removing the ~5% of training data with the highest self-influence (potentially mislabeled examples), hoping to filter out mislabeled data. It didn't hurt performance much (we did not clean the eval set). However, we were looking for an improvement, and we didn't see one.
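For anyone reading along, the filtering step described above could be sketched roughly as follows. This is only an illustration, not the authors' code: it assumes the self-influence scores have already been computed, one per training example, and `drop_top_self_influence` and `fraction` are hypothetical names.

```python
def drop_top_self_influence(scores, fraction=0.05):
    """Return indices of examples to keep after dropping the
    `fraction` of examples with the highest self-influence score.

    `scores` is assumed to be a list of precomputed self-influence
    values, one per training example (hypothetical input).
    """
    n_drop = int(round(fraction * len(scores)))
    # Indices sorted by ascending score; the highest scores come last.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    keep = order[: len(scores) - n_drop]
    # Restore the original dataset order for the kept examples.
    return sorted(keep)


# Usage: keep everything except the single highest-scoring example.
kept = drop_top_self_influence([0.1, 5.0, 0.2, 0.3], fraction=0.25)
```

The kept indices can then be used to subset the training set before retraining, leaving the eval set untouched as in the reply above.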
Hi, got it!
Thank you for your reply!
Hi, thank you for the great work.
I was wondering: have you tried using influence scores to select the most influential data points in order to reduce the training set?
Thank you!