speedinghzl / DSRG

Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (CVPR 2018).
MIT License

Reproducing Paper Results #8

Open mcever opened 5 years ago

mcever commented 5 years ago

Hi,

I am trying to reproduce the 59.0 mIOU seen in the paper, but so far, all I can achieve is about 54.3 mIOU.

I wrote a script to convert all of the VOC SegmentationClass ground truth from its original RGB form to a single-channel grayscale format in which each pixel's intensity is the index of its class in the alphabetically ordered VOC class list (e.g. 0 is background, 1 is aeroplane, 15 is person).
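For readers who want to check the same conversion, here is a minimal sketch of mapping an RGB VOC mask to class indices. It is not the script linked later in this post; the function names `voc_colormap` and `rgb_to_index` are made up for illustration, and it assumes the standard VOC palette, 21 classes (indices 0 to 20), and 255 as the label for any other colour such as the boundary colour.

```python
import numpy as np

def voc_colormap(n=256):
    """Build the standard PASCAL VOC palette by bit-shuffling the class index."""
    cmap = np.zeros((n, 3), dtype=np.uint8)
    for i in range(n):
        c, r, g, b = i, 0, 0, 0
        for j in range(8):
            # Bits 0, 1, 2 of the index feed the red, green and blue channels.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap[i] = (r, g, b)
    return cmap

def rgb_to_index(rgb, num_classes=21, ignore_index=255):
    """Map an (H, W, 3) RGB mask to an (H, W) uint8 array of class indices.

    Pixels whose colour matches none of the first num_classes palette
    entries are set to ignore_index (an assumption of this sketch).
    """
    rgb = np.asarray(rgb, dtype=np.uint8)
    cmap = voc_colormap()
    out = np.full(rgb.shape[:2], ignore_index, dtype=np.uint8)
    for idx in range(num_classes):
        out[np.all(rgb == cmap[idx], axis=-1)] = idx
    return out
```

If the PNGs are stored in palette mode, reading them without converting to RGB may already give these indices directly, so it is worth checking the image mode before converting.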

After doing so and training, I get an mIOU of 0.543.

I then converted the SBD ground truths to the same 0-20 grayscale PNG format and placed those images in my SegmentationClass folder. After rerunning run.sh, I got 0.542 mIOU, which is almost no difference. Perhaps this is the wrong way of including the SBD annotations, but I'm not sure how else I would include them. I suppose that for training I should only need image-level labels from SBD, not the whole segmentation, and I may not even need those, since they are likely included in localization_cues-sal.pickle.

Do I need to edit any list files or maybe place the SBD files in a different directory? Is there any other data augmentation you used on the VOC and/or SBD data?

If you're interested in how I did the format conversions, you can see the scripts here: https://github.com/mcever/Point-DSRG/tree/master/training/tools/data_prep

Any help you can provide would be greatly appreciated. I'm having a hard time figuring out why the mIOU barely changed after augmenting with SBD, and I'm not sure why there's still a gap of about 0.05 between my results and the number in your paper. My best guess is that there is some data augmentation I should apply to the JPEGImages, but it could also be related to the SBD data if image-level labels are fetched from outside the pickle file during training.
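In case the gap comes from how the score is computed rather than from training, this is a minimal sketch of the usual confusion-matrix mIoU. It is not the repository's evaluation code. It assumes 21 classes, ground-truth pixels labelled 255 are ignored, and predictions are already in the range 0 to 20; `confusion_matrix` and `mean_iou` are names chosen for this example.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes=21, ignore_index=255):
    """Accumulate a (num_classes x num_classes) matrix: rows = truth, cols = prediction."""
    gt = np.asarray(gt).ravel().astype(np.int64)
    pred = np.asarray(pred).ravel().astype(np.int64)
    keep = gt != ignore_index  # drop void / boundary pixels
    idx = gt[keep] * num_classes + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(conf):
    """Mean over classes of TP / (TP + FP + FN), skipping classes that never occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    return float((tp[valid] / union[valid]).mean())
```

The confusion matrices of all validation images would be summed first and `mean_iou` applied once to the total; averaging per-image scores gives a different number.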

Thanks, Austin

speedinghzl commented 5 years ago

Only the images are needed in the training step. I cannot figure out why you get this performance. It would help if you could upload the training log file.

mcever commented 5 years ago

Does the code generate a specific training file, or should I just upload the output of run.sh?

Thanks again, Austin

speedinghzl commented 5 years ago

The output of run.sh.

mcever commented 5 years ago

You can find the output from run.sh here:

https://raw.githubusercontent.com/mcever/Point-DSRG/master/training/experiment/seed_mc/run-sh_out.txt

Please note that I added a few echo statements to run.sh that made it a bit easier for me to review the process.

speedinghzl commented 5 years ago

I have no idea after reading the log file. Can you evaluate the model generated by the first step (before the retraining step)?

mcever commented 5 years ago

Thanks for working with me. You can see details of my evaluation after the first step here, and I have highlighted the most interesting line where I print the results:

https://github.com/mcever/Point-DSRG/blob/master/training/experiment/seed_mc/first-step-out.txt#L2789

speedinghzl commented 5 years ago

The result of step 1 (before retraining) is significantly lower than the result reported in the paper. It's close to the result of the model without DSRG. Maybe you could remove the DSRG layer and check its performance. If the performance drops significantly, the problem may lie in localization_cues-sal.pickle. You can try to redownload it.
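A quick way to compare the existing copy of the pickle with a fresh download is to hash both files and look at their top-level structure. This is a minimal sketch. The internal layout of localization_cues-sal.pickle is not described in this thread, so the code only reports generic properties, and `sha256_of` and `summarize_pickle` are hypothetical helper names.

```python
import hashlib
import pickle

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def summarize_pickle(path):
    """Load a pickle and report its type, length and a few keys if it is a dict."""
    with open(path, "rb") as f:
        # encoding="latin1" is an assumption: it lets Python 3 read
        # pickles that were written by Python 2.
        obj = pickle.load(f, encoding="latin1")
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["len"] = len(obj)
    if isinstance(obj, dict):
        summary["sample_keys"] = sorted(map(str, obj))[:5]
    return summary
```

If the two files hash to the same value, the download is not the problem and the cause has to be looked for elsewhere.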