Open amandalucasp opened 2 years ago
Hi @amandalucasp!
That sounds like a really interesting extension! I'll answer your questions below:
On `ClassStratifiedSampler` and `labels_matrix`:

`ClassStratifiedSampler` is used to sample the support mini-batch in each iteration. It first samples a set of classes, and then samples an equal number of images from each class (i.e., the common class-balanced sampling setup). This is important, since a highly imbalanced support set would bias your nearest-neighbour classifier (i.e., you would have a higher chance of predicting a certain class if most of the instances in the support set belong to that class).

`labels_matrix` just identifies which images in your mini-batch belong to the same class. Imagine you have 1 billion classes, but in each iteration your support-set sampler only sub-samples 100 classes. Then `labels_matrix` can be a one-hot matrix with 100 classes, identifying which images in your mini-batch come from the same class. As an additional step, you can apply label smoothing to this matrix, which helps improve stability with mixed-precision training.

Honestly, `ClassStratifiedSampler` is a little complex, just because of how the ImageNet data is structured. If your data is not too imbalanced, you could start out with just a regular sampler and set `labels_matrix` to the concatenation of the one-hot labels from the sampler (I would still use smoothing here to improve stability, though). If your `labels_matrix` is not one-hot, however, then you may need to change the loss, since a standard cross-entropy (i.e., multinomial loss) would no longer make sense.
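To make the two pieces concrete, here is a minimal sketch of class-balanced support sampling plus a smoothed `labels_matrix` (function names and the `target_indices` dict layout are hypothetical simplifications; the repo's actual `ClassStratifiedSampler` is more involved):

```python
import random

import numpy as np

def sample_support(target_indices, n_classes, imgs_per_class, rng=random):
    """Sample a class-balanced support mini-batch.

    target_indices: dict mapping integer class label -> list of dataset indices.
    Returns (flat list of dataset indices grouped by class, sampled classes).
    """
    classes = rng.sample(sorted(target_indices), n_classes)
    batch = []
    for c in classes:
        batch.extend(rng.sample(target_indices[c], imgs_per_class))
    return batch, classes

def smoothed_labels_matrix(n_classes, imgs_per_class, smoothing=0.1):
    """One-hot matrix over the sub-sampled classes, with label smoothing.

    Row i marks which of the n_classes sub-sampled classes image i belongs to;
    smoothing spreads `smoothing / n_classes` mass over the non-target entries.
    """
    labels = np.repeat(np.arange(n_classes), imgs_per_class)
    one_hot = np.eye(n_classes)[labels]
    return (1.0 - smoothing) * one_hot + smoothing / n_classes
```

Note that the matrix only has `n_classes` columns (the sub-sampled classes), not one column per class in the full dataset.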
Hi @MidoAssran! Thanks so much for the quick reply. I had to take a small break from the project due to health issues (that's why I'm kind of replying so late) but will get back to it soon. Your remarks will be key to my advances. Thanks again for the thorough response :)
Hi again @MidoAssran 😄 I'm trying a few different samplers, and wanted to test your `ClassStratifiedSampler` as well, since my data is imbalanced. I simply adjusted my custom dataset's `target_indices` to fit my labels, which are all one-hot (for the pre-training stage, I'm experimenting with transforming my multi-label targets into multi-class ones), and adjusted `supervised_imgs_per_class` in my yaml file to fit my most under-represented class. I just wanted your feedback on this approach: does it make sense to you? The code is apparently running fine.
Thanks in advance!
Hi @amandalucasp hope you're doing ok!
Yes, that logic makes sense to me, but I'm happy to take a look at any code if you'd like to verify!

All `ClassStratifiedSampler` needs to do its job is the `target_indices` property in the dataset object passed to it, which, given a target (integer class label), maps to the indices in your dataset belonging to that target! It sounds like you've adjusted this already, so that sounds great. However, I would still look at the labels sampled by your custom `ClassStratifiedSampler`, just to make sure it's working correctly.
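One cheap way to do that check is to count how often each class appears in a sampled batch and confirm the counts are equal (a sketch; it assumes you can recover each item's integer label from its dataset index):

```python
from collections import Counter

def check_batch_balance(batch_indices, labels):
    """Verify every class in a sampled batch appears equally often.

    batch_indices: dataset indices produced by the sampler for one batch.
    labels: labels[i] is the integer class of dataset item i.
    Raises AssertionError on an imbalanced batch; returns per-class counts.
    """
    counts = Counter(labels[i] for i in batch_indices)
    per_class = set(counts.values())
    assert len(per_class) == 1, f"imbalanced batch: {dict(counts)}"
    return dict(counts)
```

Running this over a few batches from the sampler should quickly surface any `target_indices` mapping mistakes.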
I'm fine now, thanks =)
Great! If it comes to that, will do :) thank you!
And yes, I made sure `target_indices` are integers. But for the rest of the code, I believe the labels are treated as one-hot, as in `labels_matrix`, right? If I print the `labels`, this is what I get:
```
labels: tensor([[0.9100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100,
         0.0100],
        [0.0100, 0.9100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100,
         0.0100],
        [0.0100, 0.0100, 0.9100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100,
         0.0100],
        [0.0100, 0.0100, 0.0100, 0.9100, 0.0100, 0.0100, 0.0100, 0.0100, 0.0100,
         0.0100],
(...)
```
I believe it makes sense, since I'm using label smoothing as you recommended.
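Those numbers are indeed consistent with label smoothing: with smoothing s and K sub-sampled classes, a smoothed one-hot row has (1 - s) + s/K on the target entry and s/K elsewhere. A quick check (s = 0.1 and K = 10 are assumed from the printout):

```python
# Assumed config: label smoothing of 0.1 over 10 sub-sampled classes.
smoothing, n_classes = 0.1, 10

# Smoothed one-hot entries: (1 - s) + s/K on the target, s/K elsewhere.
on_value = (1.0 - smoothing) + smoothing / n_classes   # the 0.91 entries
off_value = smoothing / n_classes                      # the 0.01 entries

# Each row should still sum to 1.
row_sum = on_value + (n_classes - 1) * off_value
```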
Hi @amandalucasp, I am trying to extend PAWS to a multi-label classification problem and saw here that you are working on the same problem. Have you managed to write a new stratified sampler? If so, I would appreciate it if you could share your code. Cheers!
Hi @mahsaep :) My dataset is highly imbalanced, so a custom sampler would be a bit complex; for now I'm trying a different approach. Since I'm still not done with these experiments (some are still running), I haven't gotten back to the idea of adjusting PAWS pre-training for a multi-label dataset... but good luck! =)
Oh, and by the way, you should also keep in mind some remarks made in https://github.com/facebookresearch/suncet/issues/35#issuecomment-1157005128, namely that you might need to change the loss that's implemented.
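For multi-label targets, where a row of the targets matrix is no longer one-hot, the usual substitute for the multinomial cross-entropy is per-class binary cross-entropy (one sigmoid per class instead of a single softmax). A minimal sketch of that loss, not the repo's implementation:

```python
import numpy as np

def multilabel_bce(logits, targets, eps=1e-12):
    """Per-class binary cross-entropy for multi-label targets.

    logits: (batch, n_classes) raw scores.
    targets: (batch, n_classes), entries in {0, 1} (or soft values in [0, 1]).
    Each class gets an independent sigmoid, so multiple classes per image
    can be "on" at once, unlike a softmax over classes.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    probs = np.clip(probs, eps, 1.0 - eps)  # guard log(0)
    loss = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return loss.mean()
```

In practice you would use the framework's fused version (e.g. a BCE-with-logits loss) for numerical stability, but the math is the same.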
Hi! Thanks for sharing your work :)
I am trying to run some tests using PAWS on a multi-label problem. The major changes I have already implemented: (1) a custom dataset class and its corresponding "Trans" class, like `TransImageNet`; (2) custom `_make_data_transforms` and `_make_multicrop_data_transforms` functions.
Now I'm working on adapting `ClassStratifiedSampler`, and therefore also `labels_matrix`. I'm having trouble fully understanding how these two work together: is `labels_matrix` simply the concatenation of one-hot labels from the sampler (`ClassStratifiedSampler`), with smoothing applied? Also, do you think it makes sense to adapt `ClassStratifiedSampler` to a multi-label dataset, or should I just use a regular sampler (in which case I could do as mentioned in https://github.com/facebookresearch/suncet/issues/22#issuecomment-901921422)? Thanks in advance for any tips!