seunghyukcho / CoNAL-pytorch

PyTorch implementation of CoNAL (Common Noise Adaptation Layers) from "Learning from Crowds by Modeling Common Confusions"

Suggestions for Music dataset #1

Open zdchu opened 3 years ago

zdchu commented 3 years ago

Hi there,

Thanks for your interest in our paper! The poor performance on the Music dataset is mainly caused by the classifier. You can try to add two Batch Norm before the two linear layers.

seunghyukcho commented 3 years ago

Thanks for the advice!

Do you mean adding two Batch Norm layers to the Music classifier, like this?

original code:

self.layers = nn.Sequential(
    nn.Linear(124, self.n_units),
    nn.BatchNorm1d(self.n_units, affine=False),
    nn.ReLU(),
    nn.Dropout(self.dropout),
    nn.Linear(self.n_units, self.n_class),
    nn.Softmax(dim=-1)
)

fixed code:

self.layers = nn.Sequential(
    nn.BatchNorm1d(124, affine=False),  # added!
    nn.Linear(124, self.n_units),
    nn.BatchNorm1d(self.n_units, affine=False),
    nn.ReLU(),
    nn.Dropout(self.dropout),
    nn.Linear(self.n_units, self.n_class),
    nn.Softmax(dim=-1)
)
zdchu commented 3 years ago

Yes! But I added the second BN after the ReLU and dropout. You can give it a try!
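For clarity, the variant described above would place the second BatchNorm after the ReLU and dropout rather than directly after the first linear layer. A minimal sketch (the values of `n_units`, `n_class`, and `dropout` are illustrative assumptions, not values from this thread):

```python
import torch.nn as nn

# Illustrative hyperparameters (assumptions, not from the thread)
n_units, n_class, dropout = 128, 10, 0.5

# Second BN moved to after ReLU and Dropout, as suggested above;
# affine=False matches the snippets earlier in the thread.
layers = nn.Sequential(
    nn.BatchNorm1d(124, affine=False),
    nn.Linear(124, n_units),
    nn.ReLU(),
    nn.Dropout(dropout),
    nn.BatchNorm1d(n_units, affine=False),
    nn.Linear(n_units, n_class),
    nn.Softmax(dim=-1),
)
```

Note that `BatchNorm1d` needs a batch size greater than 1 in training mode, so the last (possibly smaller) batch of an epoch may need to be dropped or the module switched to `eval()` for inference.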

seunghyukcho commented 3 years ago

Thanks for the additional details. However, I've followed your advice but only get accuracy near 0.70, so I have some questions about the details of your implementation.

zdchu commented 3 years ago

Yes, I set affine=False in BN. I got near 0.81 on the training set. The other settings are the same.

seunghyukcho commented 3 years ago

I've now achieved an average accuracy of 0.75, with training accuracy similar to yours. I'll try more techniques like weight initialization.

One more question: did you use batch size 1024 and learning rate 1e-2 for Music?

And did your Music training set also have 700 training examples?
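As an aside, weight initialization in PyTorch is typically applied with `Module.apply`. A generic sketch using Xavier initialization (one common choice; the thread does not specify which scheme, if any, the authors used):

```python
import torch.nn as nn

def init_weights(m: nn.Module) -> None:
    # Xavier-uniform init for linear layers, zero bias.
    # This is a generic example, not the paper's configuration.
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

# Hypothetical classifier with the same input width as the Music snippets
model = nn.Sequential(
    nn.Linear(124, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
model.apply(init_weights)  # applies init_weights to every submodule
```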

lazyloafer commented 2 years ago

Hi Chu, I'm reading your paper (Improve Learning from Crowds via Generative Augmentation in KDD'21). I want to find out whether the structure of the generator G in Fig. 1 is similar to that in CoNAL? In other words, how does G combine instance diffusion, annotator expertise, true labels of the instances, and random noise? @zdchu