Thanks for the excellent work, but I have some questions about the result of the forwardNet.
I think the result of the forwardNet should have shape batch_size * channel * output_res * output_res, so we should use result[:,:17,:,:] for the detection loss and result[:,17:34,:,:] for the embedding loss.
However, in models/posenet.py line 52, the author uses dets = preds[:,:,:17] and tags = preds[:,:,17:34].
It seems like the result of forwardNet is output_res * output_res * channel.
Why? Does this have anything to do with line 49 in posenet.py?
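To make the two indexing conventions concrete, here is a minimal sketch of the shapes I have in mind. It is not code from the repository: sliced_shape is a hypothetical helper, batch_size = 2 and output_res = 128 are assumed values, and the 5-D layout at the end is only a guess at one layout under which preds[:,:,:17] would select channels.

```python
def sliced_shape(shape, index):
    """Return the shape left after applying a tuple of slices to `shape`.

    Hypothetical helper for illustration only: it handles slice objects,
    and any axis not mentioned in `index` is kept whole.
    """
    out = []
    for axis, size in enumerate(shape):
        if axis < len(index):
            start, stop, step = index[axis].indices(size)
            out.append(len(range(start, stop, step)))
        else:
            out.append(size)
    return tuple(out)


ALL = slice(None)  # same meaning as ":" in an index expression

batch_size, output_res = 2, 128  # assumed values, not taken from the repo

# Layout I expected: batch_size * channel * output_res * output_res
expected = (batch_size, 34, output_res, output_res)
dets_expected = sliced_shape(expected, (ALL, slice(0, 17)))   # result[:,:17,:,:]
tags_expected = sliced_shape(expected, (ALL, slice(17, 34)))  # result[:,17:34,:,:]

# Indexing as in posenet.py, preds[:,:,:17], applied to that same 4-D layout:
# the slice lands on the third axis, which would be a spatial axis here.
dets_third_axis = sliced_shape(expected, (ALL, ALL, slice(0, 17)))

# Assumption: if there were an extra axis (size 4 here) before the channels,
# the same indexing would select channels instead.
with_extra_axis = (batch_size, 4, 34, output_res, output_res)
dets_extra_axis = sliced_shape(with_extra_axis, (ALL, ALL, slice(0, 17)))

print(dets_expected, tags_expected, dets_third_axis, dets_extra_axis)
```

With the 4-D layout, preds[:,:,:17] would cut a spatial axis rather than the channels, which is why I suspect the tensor is laid out differently from what I assumed.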