lorenmt / reco

The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast" [ICLR 2022].
https://shikun.io/projects/regional-contrast

About the performance of training #8

Closed swt199211 closed 2 years ago

swt199211 commented 2 years ago

I'm sorry to bother you. I see that your code works very well, so I wanted to try running it. However, when I run ReCo in the semi-supervised setting with 600 labels, the result is only 0.6719. Due to GPU memory constraints, I changed the batch size from 10 to 6 and left all other parameters unchanged. Why is the gap so large? I can't figure out the reason, so I'd like to ask for your advice. Thank you very much.

lorenmt commented 2 years ago

Hello, changing the batch size from 10 to 6 basically reduces the training data per step by 40%, which is expected to decrease performance by a large margin.

What I would suggest is keeping the original batch size, but instead reducing the query size --num_query to 128 or 64. That reduces GPU memory quite a lot while still maintaining decent performance.

Let me know whether this helps.
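(For context on why reducing `--num_query` saves memory: the regional contrastive loss compares every sampled query against a positive key and a set of sampled negative keys, so the dominant memory cost grows with the number of queries times the number of keys. The sketch below is a generic InfoNCE-style formulation for illustration only, not code from this repository; all names and the temperature value are placeholders.)

```python
import torch
import torch.nn.functional as F

def info_nce_pixel_loss(queries, pos_keys, neg_keys, temperature=0.5):
    """Generic InfoNCE-style loss over sampled pixel representations.

    queries:  [Q, D]    sampled query features (Q is controlled by --num_query)
    pos_keys: [Q, D]    one positive key per query
    neg_keys: [Q, N, D] sampled negative keys per query

    The [Q, N] similarity matrix dominates memory, so halving the number
    of queries roughly halves this cost while the batch size stays intact.
    """
    q = F.normalize(queries, dim=1)
    pos = (q * F.normalize(pos_keys, dim=1)).sum(dim=1, keepdim=True)    # [Q, 1]
    neg = torch.einsum('qd,qnd->qn', q, F.normalize(neg_keys, dim=2))    # [Q, N]
    logits = torch.cat([pos, neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)   # positive sits at index 0
    return F.cross_entropy(logits, labels)
```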

swt199211 commented 2 years ago

Thank you for your reply. I'll try it right away and report the results back to you.

swt199211 commented 2 years ago

Thank you again for your reply. I have now reproduced the same results as in your paper. I am currently trying to adapt your semi-supervised method to a binary infrared small-target segmentation network: it only distinguishes target from background, the infrared targets are very small, and some are just a few pixels. Do you think your method is applicable here, and what should I pay attention to? I have no experience; I only started working on this topic in my first year of graduate school, and nobody else in my group works on it. I would be very grateful for a reply.

lorenmt commented 2 years ago

I think it should work, as long as you have some accurately labelled training images.

Since DeepLabv3+ down-samples images by 4 times, it may sometimes completely ignore small objects.

So you need to make sure that the small objects are still detectable even after 4x down-sampling. Otherwise, you need to pre-process your images to make them easier to train on.
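(To make the point above concrete, here is a minimal, illustrative check of whether a tiny target still contains foreground pixels after a 4x spatial reduction; the function name and its arguments are hypothetical and not part of this repository.)

```python
import torch
import torch.nn.functional as F

def survives_downsampling(mask: torch.Tensor, factor: int = 4, min_pixels: int = 1) -> bool:
    """Check whether a binary target mask (1 = target, 0 = background) still
    contains at least `min_pixels` foreground pixels after being down-sampled
    by `factor`, mimicking the loss of spatial resolution inside the network."""
    small = F.interpolate(mask[None, None].float(), scale_factor=1.0 / factor, mode="nearest")
    return int(small.sum().item()) >= min_pixels

# Example: a 3x3 target in a 512x512 image may or may not survive a 4x reduction.
mask = torch.zeros(512, 512)
mask[100:103, 200:203] = 1
print(survives_downsampling(mask, factor=4))
```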

swt199211 commented 2 years ago

Thank you again for your reply. The network I use is dedicated to small infrared targets. I have now plugged your semi-supervised training into this network, but during training the ReCo loss stays at its initial value and never decreases. What could be the reason? My current level is not very good and I haven't found the problem. I see that in your original training the ReCo loss drops by about 2.

swt199211 commented 2 years ago

By the way, my learning rate is set to 0.05, because that is the learning rate of the original network on the infrared dataset, which is far from the 0.0025 used in your ReCo network. Could the learning rate be the problem? In addition, I use an IoU loss as the supervised loss, because the original network uses an IoU loss instead of the cross-entropy loss in your code. Everything else is the same. I can't tell where the problem is.

lorenmt commented 2 years ago

Yes, I wouldn't change the learning rate: the network used here is pre-trained, so a small learning rate is essential.

Changing cross-entropy to an IoU loss shouldn't be an issue. I think you just need to do more trial-and-error to find the problem. To start with, you should at least make sure that supervised learning on the labelled data alone gives a reasonable performance.
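(For reference, a common differentiable IoU, i.e. Jaccard, loss for binary segmentation looks like the sketch below; this is a generic formulation for illustration, not the questioner's network loss or code from this repository.)

```python
import torch

def soft_iou_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Differentiable (soft) IoU loss for binary segmentation.

    logits: [B, 1, H, W] raw network outputs
    target: [B, 1, H, W] ground-truth masks in {0, 1}
    """
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = (prob + target - prob * target).sum(dim=(1, 2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()
```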

swt199211 commented 2 years ago

Thank you. Yes, I can confirm that the IoU reaches about 0.8 on the supervised data. Since this is a binary classification task, one question occurs to me: is the weak_threshold of 0.7 set too low?

swt199211 commented 2 years ago

The ReCo loss decreases significantly in the first epoch, but in the second epoch it rises back to its initial value and then stabilises at roughly 0.62 without decreasing further. It feels strange.

lorenmt commented 2 years ago

If you're just doing binary segmentation, I don't think the ReCo loss will help that much... it works better when there are more classes, which helps the network learn a smoother decision boundary.

swt199211 commented 2 years ago

You mean that contrastive learning is not suitable for binary classification? Is it difficult to find a semi-supervised method suitable for binary classification?

lorenmt commented 2 years ago

You can try ReCo first, if it doesn't work well, then we can think of something else.

swt199211 commented 2 years ago

OK. Only one of the ReCo loss and the supervised loss decreases at a time, and in most cases it is the ReCo loss that does not decrease; the two losses never decrease together. Thank you again for your reply.

swt199211 commented 2 years ago

I have a question. For ReCo, you locate the corresponding position in the feature layer from the pixels, but how can you make sure the network has actually learned this positional correspondence, especially if the number of sampled query points is small?

lorenmt commented 2 years ago

I don't understand this question. Could you provide more details?

swt199211 commented 2 years ago

Hello, let me rephrase my question. At the end of the network you add a feature-representation branch, so there are two outputs: semantic segmentation and feature representation. The contrastive learning selects features based on the results of the segmentation branch, and then takes the feature vector at the corresponding position in the representation branch. But is this correspondence reasonable? Why can the network learn this correspondence, and if it does not, will that affect the final result?

swt199211 commented 2 years ago

Sorry to bother you again. Did you understand my question? I have only just started learning deep learning, so my questions may sound strange. I'm sorry.

lorenmt commented 2 years ago

I don't understand why you think the correspondence between the feature branch and the label branch is not consistent; they have the same spatial size and are designed to have a pixel-to-pixel correspondence.
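(To illustrate the pixel-to-pixel correspondence: the segmentation head and the representation head are applied to the same decoder features, so their outputs share the same spatial grid and the same (b, y, x) index addresses both. The sketch below is illustrative only; the module and variable names are not from this repository.)

```python
import torch
import torch.nn as nn

class TwoHeadDecoder(nn.Module):
    """Illustrative two-head design: the segmentation head and the representation
    (projection) head operate on the same decoder features, so their outputs
    have identical spatial size and a pixel-to-pixel correspondence."""
    def __init__(self, in_ch: int = 256, num_classes: int = 2, rep_dim: int = 256):
        super().__init__()
        self.seg_head = nn.Conv2d(in_ch, num_classes, kernel_size=1)
        self.rep_head = nn.Conv2d(in_ch, rep_dim, kernel_size=1)

    def forward(self, feat: torch.Tensor):
        logits = self.seg_head(feat)   # [B, num_classes, H, W]
        rep = self.rep_head(feat)      # [B, rep_dim, H, W]
        return logits, rep

decoder = TwoHeadDecoder()
feat = torch.randn(2, 256, 64, 64)
logits, rep = decoder(feat)

# Because logits and rep share the same H x W grid, the same (b, y, x) index is
# valid for both: pick pixels where the segmentation branch is confident, then
# take the representation vectors at exactly those positions.
conf, _ = torch.softmax(logits, dim=1).max(dim=1)    # [B, H, W]
mask = conf > 0.7                                    # e.g. a weak_threshold-style cut-off
queries = rep.permute(0, 2, 3, 1)[mask]              # [num_selected, rep_dim]
```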

swt199211 commented 2 years ago

Sorry, I understand now. The locations do correspond. Thank you for your reply.