Open waleedillini opened 3 years ago
You've chosen to report an unexpected problem or bug. Unless you already know the root cause of it, please include details about it by filling in the issue template. The following information is missing: "Your Environment".
"height" and "width" are meant to be the desired output shape during inference. They are not expected to affect training, because the model is designed to output losses during training (and the output therefore has no shape).
We'll update the doc to clarify that.
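To make the inference-only role of "height" and "width" concrete, here is a minimal plain-Python sketch of rescaling predicted boxes from the network input size to a requested output size. It is not detectron2's code: the function name `rescale_boxes` and the sample numbers are hypothetical, and the library's own postprocessing may differ in detail.

```python
def rescale_boxes(boxes, input_size, output_size):
    """Scale (x1, y1, x2, y2) boxes from input_size to output_size.

    Both sizes are (height, width) tuples. Hypothetical helper,
    written only to illustrate the idea.
    """
    in_h, in_w = input_size
    out_h, out_w = output_size
    scale_x = out_w / in_w
    scale_y = out_h / in_h
    return [
        (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)
        for (x1, y1, x2, y2) in boxes
    ]


# Assumed example: the network saw a 100x100 image and the caller
# asked for predictions on a 300x300 grid.
predicted = [(10.0, 20.0, 50.0, 80.0)]
rescaled = rescale_boxes(predicted, input_size=(100, 100), output_size=(300, 300))
print(rescaled)  # [(30.0, 60.0, 150.0, 240.0)]
```

In this sketch the rescaling happens only after prediction, so nothing in it touches the coordinate frame the training losses are computed in.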
In that case, how is one supposed to train a network like this? In my case I'm giving input to the model in polar (range, phi) coordinates. After the backbone, I'm converting the features from a polar to a cartesian grid. This changes the height and width of the feature maps output by the backbone. My ground truth coordinates are defined with respect to this new cartesian grid shape. However, all the proposals generated by the model during training are clipped to stay within the input image grid shape, and this stops the model from learning anything.
(Sent by email in reply to Yuxin Wu's comment of Monday, September 27, 2021, on facebookresearch/detectron2 issue #3526: The "height" and "width" variables provided in the dataset_mapper function are not used during clipping of the proposal boxes.)
How are we supposed to get the height and width of the bounding boxes? Thanks in advance.
Issue (I already know the root cause and mention it here):
The "Input Format" section given here says that the "height" and "width" variables specify the desired output shape when it should differ from the input image shape. Example: if my input image is 100x100 and my desired output shape is 300x300, then I define my ground-truth box coordinates on the 300x300 output grid. However, despite being specified, "height" and "width" are never used when generating bounding box proposals. In the "predict_proposals" function in rpn.py, only "images.image_sizes" is passed as an argument, and it is what the proposals are clipped to. This means the proposed boxes are clipped to stay within the 100x100 range, whereas my ground-truth boxes are defined on the 300x300 grid (the desired output shape). I believe this is a genuine bug, and the documentation at the given link is also wrong. I spent an entire week debugging this because, according to the documentation, I was doing the right thing by specifying the desired output grid shape through the "height" and "width" variables in the dataset_mapper function.
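The mismatch described above can be reproduced in a few lines of plain Python. This is only a sketch of the reported behaviour, not detectron2's actual clipping code: `clip_box`, `box_area`, and the box coordinates are hypothetical, chosen so that the proposal lies where a 300x300 ground-truth box would be.

```python
def clip_box(box, image_size):
    """Clip an (x1, y1, x2, y2) box to an image of size (height, width)."""
    h, w = float(image_size[0]), float(image_size[1])
    x1, y1, x2, y2 = box
    return (
        min(max(x1, 0.0), w),
        min(max(y1, 0.0), h),
        min(max(x2, 0.0), w),
        min(max(y2, 0.0), h),
    )


def box_area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    x1, y1, x2, y2 = box
    return max(x2 - x1, 0.0) * max(y2 - y1, 0.0)


input_size = (100, 100)                  # shape the proposals are clipped to
proposal = (140.0, 170.0, 260.0, 270.0)  # lies on the assumed 300x300 grid

clipped = clip_box(proposal, input_size)
print(clipped)            # (100.0, 100.0, 100.0, 100.0)
print(box_area(clipped))  # 0.0
```

Any proposal that falls entirely outside the 100x100 region collapses to a zero-area box in this sketch, so it cannot overlap a ground-truth box defined on the larger grid.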
Instructions To Reproduce the 🐛 Bug:
The code does not raise any errors, and no significant code changes are needed to reproduce this. The problem appears in any scenario where the desired output shape differs from the input shape.