ContrastToDivide / C2D

PyTorch implementation of "Contrast to Divide: self-supervised pre-training for learning with noisy labels"
MIT License

tabular data/ noisy instances #3

Open nazaretl opened 2 years ago

nazaretl commented 2 years ago

Hi, thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Randl commented 2 years ago

Self-supervision for tabular data is hard. If you manage to find a self-supervised pre-training method that works for your data, C2D ought to work. As for identifying noisy instances, you can try running inference on the training set: the samples with high loss are likely to be noisy.
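
To illustrate the high-loss heuristic, here is a minimal, framework-agnostic sketch. It is not code from this repository. It assumes you have already run the trained model over the training set and collected per-sample class probabilities (for example, a softmax over the model outputs). The helper names `per_sample_cross_entropy` and `flag_suspected_noisy` and the `fraction` cutoff are hypothetical, chosen only for illustration; in practice the cutoff would need tuning for your dataset.

```python
import math


def per_sample_cross_entropy(probs, labels, eps=1e-12):
    """Cross-entropy of each sample, given predicted class probabilities
    and the (possibly noisy) training labels."""
    return [-math.log(max(p[y], eps)) for p, y in zip(probs, labels)]


def flag_suspected_noisy(probs, labels, fraction=0.2):
    """Return indices of the highest-loss samples plus all losses.

    `fraction` is an assumed cutoff: the share of the training set
    to flag as suspected noisy.
    """
    losses = per_sample_cross_entropy(probs, labels)
    n_flag = max(1, round(fraction * len(losses)))
    # Rank sample indices from highest to lowest loss.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:n_flag]), losses


# Toy data: sample 2 is labeled class 1, but the model strongly predicts class 0.
probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.1, 0.9], [0.85, 0.15]]
labels = [0, 1, 1, 1, 0]

suspected, losses = flag_suspected_noisy(probs, labels, fraction=0.2)
clean = [i for i in range(len(labels)) if i not in suspected]
print("suspected noisy IDs:", suspected)
print("clean set IDs:", clean)
```

The returned indices give both things asked about in the question: the suspected noisy IDs and, by exclusion, a candidate clean set. A fixed fraction is only one possible rule; a loss threshold chosen from a histogram of the losses would be another.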