Closed isalirezag closed 5 years ago
Thanks for your interest in our work.
Yes, the "batch size" of the output y is the number of ROIs. I used quotes because the name "batch size" is arbitrary. If we agree that the outer dimension (leftmost in y.shape) of a tensor is called batch size, we can remove the quotes 😉.
The code does not maintain the "batch size" of the input, because the implementation is generic enough to support different ROIs per slice on the "batch size" axis of the input.
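To illustrate why the input "batch size" cannot simply be kept, here is a minimal sketch in plain Python. The box format `(x1, y1, x2, y2)` and the helper `flatten_rois` are hypothetical and are not part of this repository; the point is only that each input slice may carry a different number of ROIs, so they are stacked along a single leading axis.

```python
# Hypothetical ROIs as (x1, y1, x2, y2) boxes, grouped by input slice.
# The box format is an assumption made for illustration only.
rois_per_image = [
    [(0, 0, 4, 6)],                                # slice 0: 1 ROI
    [(1, 1, 5, 7), (2, 0, 8, 10), (0, 3, 6, 9)],   # slice 1: 3 ROIs
]


def flatten_rois(rois_per_image):
    """Return one (batch_index, box) pair per ROI, in input order."""
    return [
        (batch_index, box)
        for batch_index, boxes in enumerate(rois_per_image)
        for box in boxes
    ]


flat = flatten_rois(rois_per_image)
counts = [len(boxes) for boxes in rois_per_image]

print(len(flat))  # total number of ROIs, i.e. the leading output dimension
print(counts)     # per-slice counts differ, so a (batch, rois, ...) layout would be ragged
```

With 1 ROI on the first slice and 3 on the second, a `(2, num_rois, ...)` output would not be rectangular, while a flat axis of 4 ROIs always is.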
In general, the output size is (num_rois, channels,) + output_size, where output_size is the number of bins on height and width, respectively. You can see it on this line.
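As a quick sanity check of that shape rule, here is a small sketch. The helper `roi_output_shape` is hypothetical (it is not a function of this repository), and the bin counts `(5, 7)` are taken from the shapes quoted in this thread.

```python
def roi_output_shape(num_rois, channels, output_size):
    """Shape of the pooled output: (num_rois, channels) + output_size."""
    return (num_rois, channels) + tuple(output_size)


# Values from this thread: input of shape (2, 4, 12, 8), 4 ROIs, (5, 7) bins.
input_shape = (2, 4, 12, 8)
shape = roi_output_shape(
    num_rois=4,
    channels=input_shape[1],  # channels are carried over from the input
    output_size=(5, 7),       # bins on height and width
)
print(shape)  # (4, 4, 5, 7)
```

Note that the input "batch size" of 2 does not appear anywhere in the result; only the number of ROIs, the channel count, and the bin counts do.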
In your case, I would expect (4, 4, 5, 7). Please confirm that.
If I misunderstood your question, please post code that reproduces it.
That makes it faster to understand the question or to detect any bug.
In particular, it's unclear why you are expecting (2, 4, 4, 5, 7). Lemme know if that's your question.
No need to apologize 👍
Thank you for your code. I am a little confused about the procedure. Sorry if my question is stupid. I am looking at the test case that you provided.
So, based on my understanding, the batch_size in the output y is the number of ROIs? Am I correct? If that is the case, what happens to the batch size of the input? In other words, let's say my input has shape torch.Size([2, 4, 12, 8]) and I have 4 ROIs, so the output will have size torch.Size([4, 4, 5, 7]). Should the output instead have size torch.Size([2, 4, 4, 5, 7])?