Bounding box normalization in preprocessing with load_mask

Yueying13 commented 9 months ago

Hi Jeff!

I was preprocessing a dataset containing masks and noticed that the bounding box labels I was getting were too small.

When I checked tools/preprocessing.py, I found that when args.load_masks is set, the masks that are read get resized to [int(s / cfg.DATASET.OUTPUT_SIZE[0]) for s in cfg.DATASET.INPUT_SIZE] (L114), which with your config file is (192, 128). However, at L146 the bbox label is normalised with camera['Nu'] and camera['Nv'] from the camera parameters, i.e. the size of the original image (1920, 1200), not the size of the resized mask. As a result the bbox label values are very small, which makes the subsequent bbox calculation and prediction inaccurate.
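To illustrate the mismatch, here is a minimal sketch in plain Python. It is not the code from tools/preprocessing.py: the helper names `bbox_from_mask` and `normalize_bbox` and the toy mask are hypothetical, and only the two sizes, (192, 128) and (1920, 1200), come from the description above. It assumes the bbox is measured in the pixel coordinates of the resized mask.

```python
def bbox_from_mask(mask):
    """Tight box (xmin, ymin, xmax, ymax) in the pixel coordinates of `mask`.

    `mask` is a list of rows, each a list of booleans (hypothetical format).
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not ys:
        raise ValueError("mask is empty")
    return (xs[0], ys[0], xs[-1], ys[-1])


def normalize_bbox(bbox, width, height):
    """Divide x coordinates by `width` and y coordinates by `height`."""
    xmin, ymin, xmax, ymax = bbox
    return (xmin / width, ymin / height, xmax / width, ymax / height)


# Sizes taken from the discussion above.
mask_w, mask_h = 192, 128        # resized mask
image_w, image_h = 1920, 1200    # camera['Nu'], camera['Nv']

# Toy mask: the object covers columns 48..143 and rows 32..95.
mask = [[(48 <= x < 144) and (32 <= y < 96) for x in range(mask_w)]
        for y in range(mask_h)]

bbox = bbox_from_mask(mask)

# Dividing mask-pixel coordinates by the full image size shrinks the label.
too_small = normalize_bbox(bbox, image_w, image_h)

# Dividing by the size of the array the bbox was measured on keeps it in [0, 1].
consistent = normalize_bbox(bbox, mask_w, mask_h)

print(bbox, too_small, consistent)
```

With these numbers the first normalisation gives x coordinates 10 times smaller than the second (1920 / 192), which matches the "too small" labels I was seeing.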

Could you please check whether I understand the bbox label processing correctly? After I modified the bbox label normalisation, I got a better bbox loss.

tpark94 commented 9 months ago

Hello,

Thank you for pointing it out. You are correct that it is an error.

The CSV files that I used to run the experiments have correct bounding box labels. What probably happened is that I created these CSV files using the original masks before I added the bit of code that resizes and saves the images/masks. It seems I overlooked this since nobody else had access to the masks ;)

I'm going to upload the masks at some point this month and update the code. I'll keep this issue open until then just in case. Thanks for catching the bug!

Jeff

Yueying13 commented 9 months ago

ok great! thank you :)

tpark94 commented 9 months ago

Hello,

I just updated the repo with a bug fix, and also updated the README with a link to the binary masks. Please give them a try and let me know if anything comes up.

tpark94 / spnv2

Bounding box normalization in preprocessing with load_mask #6