aimerykong / OpenGAN

ICCV2021 - training a post-hoc lightweight GAN-discriminator for open-set recognition

Trained discriminator not working after netD.eval() or shuffle = False #10

Open 1214710638 opened 2 years ago

1214710638 commented 2 years ago

Hello, and thanks for open-sourcing your code. I noticed that in demo_CrossDatasetOpenSet_testing.ipynb you don't call netD.eval() before testing, and the dataloader is defined with shuffle=True. I trained and tested the model following your demo. However, when I call netD.eval() or set shuffle=False in the dataloader, the test results are poor: the discriminator does not work at all. Am I missing something, or do you have any suggestions?

aimerykong commented 2 years ago

Happy to help diagnose the issue. Can you send along two curves of AUROC over epochs: one with netD.eval() and the other without? The shuffle setting should not be crucial.

Shu
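
For readers who want to reproduce the AUROC comparison requested above, here is a minimal sketch in plain Python. It assumes that a higher discriminator score means "more likely closed-set"; the function name `auroc` and the score values are hypothetical and are not taken from the OpenGAN code.

```python
def auroc(closed_scores, open_scores):
    """Fraction of (closed, open) pairs in which the closed-set sample
    scores higher than the open-set sample. Ties count as half a win.

    Assumption: higher score = more likely closed-set.
    """
    wins = 0.0
    for c in closed_scores:
        for o in open_scores:
            if c > o:
                wins += 1.0
            elif c == o:
                wins += 0.5
    return wins / (len(closed_scores) * len(open_scores))


# Hypothetical scores, for illustration only.
closed = [0.9, 0.8, 0.7, 0.4]
opened = [0.6, 0.3, 0.2, 0.1]
print(auroc(closed, opened))  # 0.9375 (15 of 16 pairs ranked correctly)
```

Evaluating this once per epoch, with and without netD.eval(), would give the two curves. A value near 0.5 corresponds to scores that do not separate the two groups.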

1214710638 commented 2 years ago

Thanks for your reply. It took me some time to get the two curves. Here they are: [image: 企业微信截图_1656917892529] [image: 企业微信截图_16569179346353] As you can see, the discriminator does not work properly after netD.eval().

1214710638 commented 2 years ago

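[image: 企业微信截图_16569180502202] https://user-images.githubusercontent.com/24508284/177099380-3d3a2193-2bdd-4aa7-90a7-638a8762df41.png [image: 企业微信截图_16569180758575] https://user-images.githubusercontent.com/24508284/177099450-6311852d-399d-4f37-894c-afb4d30356d5.png Also, here are the curves of the mean confidence scores. You can see that in the one with netD.eval() the scores are not separable.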

aimerykong commented 2 years ago

Thanks for the curves! I recall observing the same thing in my experiments, which is why I didn't use .eval(). My conjecture is that the discriminator is trained in train mode to best discriminate between outliers and closed-set samples, so it does not work well once it is switched to eval mode. I can't remember whether I tried this or not: set the batch size to 1 and compare eval and train modes again. Would you like to try that out?

Shu
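
One thing to keep in mind for the batch-size-1 experiment suggested above: when batch norm is computed over the batch dimension alone, a batch with a single value per feature normalizes every input to zero in train mode. The sketch below shows this in plain Python for one feature, without the learned scale and shift. The function names and the running statistics are hypothetical and are not taken from the OpenGAN code; whether this applies to netD depends on whether its batch-norm layers also pool over spatial positions.

```python
import math


def batchnorm_train(batch, eps=1e-5):
    """Train-mode batch norm for one feature: normalize with the
    batch's own mean and (biased) variance."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


def batchnorm_eval(batch, running_mean, running_var, eps=1e-5):
    """Eval-mode batch norm for one feature: normalize with fixed
    running statistics collected during training."""
    return [(x - running_mean) / math.sqrt(running_var + eps) for x in batch]


# With a single sample, the batch mean equals the sample itself, so the
# train-mode output is 0.0 no matter what the input is.
for x in (0.2, 3.0, -7.5):
    train_out = batchnorm_train([x])[0]
    eval_out = batchnorm_eval([x], running_mean=0.0, running_var=1.0)[0]
    print(x, train_out, eval_out)
```

PyTorch's batch-norm layers reject this degenerate case in train mode (an error along the lines of "Expected more than 1 value per channel when training"), so a batch-size-1 comparison in train mode may only be possible when each channel still sees several spatial positions.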

1214710638 commented 2 years ago

I checked more setups, and none of them worked in eval mode. I looked closely at your released test demo and discussed it with some other people. If you use train mode and batches to produce the test results, it might be a kind of cheating. In train mode, batch norm computes its statistics from the given batch, and in your setup each batch consists entirely of open-set samples (or entirely of closed-set samples). You therefore indirectly take advantage of the label information, or of the test distribution, to produce the test results, which makes the comparison with other methods unfair and could discredit the claims in your paper. You might take a closer look at this and clarify it with more experiments.
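
The mechanism described above can be illustrated with a small sketch in plain Python: in train mode, the normalized value of one and the same test sample changes with the other samples in its batch, whereas in eval mode it does not. This is a single-feature toy without the learned scale and shift, not the OpenGAN discriminator; the function names, feature values, and running statistics are all hypothetical.

```python
import math


def batchnorm_train(batch, eps=1e-5):
    """Train-mode batch norm for one feature: uses the statistics of
    the current batch."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


def batchnorm_eval(batch, running_mean, running_var, eps=1e-5):
    """Eval-mode batch norm for one feature: uses fixed running
    statistics, so each sample is processed independently."""
    return [(x - running_mean) / math.sqrt(running_var + eps) for x in batch]


probe = 1.0                      # the same test sample in both batches
closed_like = [0.0, 0.5, -0.5]   # hypothetical closed-set feature values
open_like = [4.0, 5.0, 6.0]      # hypothetical open-set feature values

# Train mode: the probe's output depends on its batch-mates.
train_with_closed = batchnorm_train([probe] + closed_like)[0]
train_with_open = batchnorm_train([probe] + open_like)[0]
print(train_with_closed, train_with_open)  # positive vs. negative

# Eval mode: the probe's output is the same in both batches.
eval_with_closed = batchnorm_eval([probe] + closed_like, 0.0, 1.0)[0]
eval_with_open = batchnorm_eval([probe] + open_like, 0.0, 1.0)[0]
print(eval_with_closed == eval_with_open)  # True
```

Because the train-mode output is a function of the whole batch, how the test set is grouped into batches (for example, all open-set or all closed-set, or a particular batch size and ordering) can influence each sample's score.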

aimerykong commented 2 years ago

Thanks for the discussion! This is a fair point, and I agree that there can be information leakage in the testing implementation. That made me think about setting the batch size to 1 when testing in train mode. I'll try this later; I'd appreciate it if you could help by trying this with your trained models.

Shu

hutudebug commented 2 years ago

In my experiments, even with .train() mode turned on in the test phase, the performance still depends heavily on the batch size, the order of the test sequence, etc., which makes this idea hard to follow :(

libo-huang commented 1 year ago

When testing in eval mode, I also get random results. That is, netD.eval() in OpenGAN doesn't work for solving open-set problems. Can anyone help check whether we are going wrong somewhere?