js-fan / ICD

Learning Integral Objects with Intra-Class Discriminator for Weakly-Supervised Semantic Segmentation

The warm-up strategy for bottom-up estimation #10

Closed Zhengyang1995 closed 2 years ago

Zhengyang1995 commented 2 years ago

Hey, I noticed that you mention that 'initial bottom-up estimates are not reliable', so you use a warm-up strategy. I ran into the same problem when trying to reproduce your work in PyTorch: after several epochs the loss turns into 'nan'. Could you point me to the specific code in your repository that implements this strategy? I tried to find it but failed. I would also really appreciate any suggestions on how to avoid unstable training of the bottom-up estimation. (I guess this is also the reason why you only use single-class images for this step?) Thank you so much!

js-fan commented 2 years ago

Hi, thank you for your interest! The warm-up strategy is implemented in the operator:
https://github.com/js-fan/ICD/blob/f78286a6bd6939d031204028aa0506adbc290b71/core/model/layers_custom/icd.py#L134-L135
and called from:
https://github.com/js-fan/ICD/blob/f78286a6bd6939d031204028aa0506adbc290b71/run_icd.py#L44

The core of this work is to exclude the disturbance of inter-class discrimination: each intra-class discriminator only sees features belonging to the class it is responsible for. In other words, we should avoid asking it to discriminate between features belonging to different foreground classes. This is why we update the bottom-up stage with single-class images only.

Another possible approach would be to exclude other classes' foreground features with masks, which could be derived from the other classes' intra-class discriminators or from the final estimations. However, this would make the pipeline too complicated and may cause another chicken-and-egg problem. Good luck!
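
For the PyTorch reproduction, a minimal sketch of the two ideas above might look like the following. This is not the repository's implementation; all names (`warmup_iters`, `icd_loss_per_image`, `cls_labels`) are illustrative, and the linear ramp is just one common way to realize a warm-up.

```python
# Hypothetical warm-up for the bottom-up (intra-class discriminator) loss.
# Assumes icd_loss_per_image is a per-image loss tensor of shape (N,) computed
# elsewhere, and cls_labels is a multi-hot (N, C) image-level label tensor.
import torch


def warmup_weight(cur_iter: int, warmup_iters: int = 2000) -> float:
    """Linearly ramp the bottom-up loss weight from 0 to 1, so the
    unreliable early estimates contribute little to the gradients."""
    return min(1.0, cur_iter / float(warmup_iters))


def single_class_mask(cls_labels: torch.Tensor) -> torch.Tensor:
    """Keep only images that contain exactly one foreground class."""
    return cls_labels.sum(dim=1) == 1


def bottom_up_loss(icd_loss_per_image: torch.Tensor,
                   cls_labels: torch.Tensor,
                   cur_iter: int,
                   warmup_iters: int = 2000) -> torch.Tensor:
    """(1) update the bottom-up stage with single-class images only,
    (2) down-weight the loss while estimates are still unreliable."""
    mask = single_class_mask(cls_labels).float()   # (N,)
    w = warmup_weight(cur_iter, warmup_iters)
    denom = mask.sum().clamp(min=1.0)              # guard against empty selection
    return w * (icd_loss_per_image * mask).sum() / denom
```

Masking out multi-class images and scaling down the loss early in training are both ways of keeping noisy bottom-up estimates from driving the loss to 'nan'; check the linked lines in `icd.py` and `run_icd.py` for how the released MXNet code actually schedules it.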

Zhengyang1995 commented 2 years ago

Thank you so much for your work and help! It helps a lot!