tjiiv-cprg / EPro-PnP-v2

[TPAMI 2024] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
https://arxiv.org/abs/2303.12787
MIT License

Could a new type of loss be introduced for classes? #10

Open harsanyidani opened 2 months ago

harsanyidani commented 2 months ago

In EPro-PnP-Det_v2, if we want to improve the classification performance, could a new type of loss theoretically be introduced with the help of the deformable correspondence head?

I was thinking about how the yaw angle distribution corresponds to different classes. During the AMIS algorithm we could take the generated rotation distribution and evaluate it on a grid from 0 to 2pi. This distribution could then be fed to a simple network that classifies based on the yaw angle. Maybe this isn't suitable for all classes, but it might be useful to train a binary classifier for pedestrians and cones (which can be mixed up by classifiers based purely on image inputs) and add its scores, with some weighting, to the corresponding scores in the FCOS detection head.
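To make the idea concrete, here is a minimal sketch of what I have in mind (all names are made up, nothing here is part of EPro-PnP-Det_v2): histogram the weighted AMIS rotation samples into yaw bins on [0, 2pi), then feed the resulting distribution to a small classifier.

```python
import torch
import torch.nn as nn

num_bins = 64

class YawClassifier(nn.Module):
    """Hypothetical tiny network that classifies from a yaw distribution."""
    def __init__(self, num_bins, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_bins, 32), nn.ReLU(),
            nn.Linear(32, num_classes))

    def forward(self, yaw_density):
        return self.net(yaw_density)

def yaw_histogram(yaw_samples, weights, num_bins):
    # Weighted histogram of samples on [0, 2*pi). Gradients flow through the
    # importance weights, but the hard binning blocks gradients w.r.t. the
    # sample positions themselves.
    idx = torch.floor(yaw_samples % (2 * torch.pi) / (2 * torch.pi) * num_bins).long()
    hist = torch.zeros(num_bins).scatter_add(0, idx, weights)
    return hist / hist.sum().clamp(min=1e-6)
```

So the classifier scores would be differentiable with respect to the AMIS sample weights, but not (with this hard binning) with respect to the sampled angles; a soft binning would be needed for that.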

Or we could just use these orient logprobs for this purpose?: https://github.com/tjiiv-cprg/EPro-PnP-v2/blob/85215de8002dbd8523ee8eaaf1bae85b47179ebe/EPro-PnP-Det_v2/epropnp_det/models/dense_heads/deform_pnp_head.py#L563-L574

This is just an idea and my question is, could this theoretically work? Can this be backpropagated at all?

Thanks in advance for the answer, and for the previous ones too, they've been very useful.

harsanyidani commented 2 months ago

I was experimenting with this and I have a question in connection with it. https://github.com/tjiiv-cprg/EPro-PnP-v2/blob/85215de8002dbd8523ee8eaaf1bae85b47179ebe/EPro-PnP-Det_v2/epropnp_det/models/dense_heads/deform_pnp_head.py#L573 Here we don't exactly get log probabilities: a constant, np.log(orient_bins / (2 * np.pi)), is added to each log probability, so the resulting probabilities no longer sum to one. What is the purpose of this? Better visualization / numerical stability? Thanks in advance!

Lakonik commented 2 months ago

Theoretically these logprobs can be backpropagated. But I don't think information extracted from the pose distribution can help improve the classification.

+ np.log(orient_bins / (2 * np.pi)) is meant to convert the bin probabilities into a continuous density function on [0, 2pi), such that the integral equals 1. This is mainly for visualization purposes.
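A quick numerical sketch of the conversion (orient_bins and the logits here are made up for illustration): each bin has width 2*pi/orient_bins, so dividing the bin probability by the bin width, i.e. adding np.log(orient_bins / (2 * np.pi)) in log space, turns it into a density whose integral over [0, 2pi) is 1.

```python
import numpy as np

orient_bins = 128
rng = np.random.default_rng(0)

# Hypothetical bin probabilities (e.g. a softmax output); they sum to 1
logits = rng.standard_normal(orient_bins)
bin_probs = np.exp(logits) / np.exp(logits).sum()

# Convert to a log density: density = probability / bin_width
log_density = np.log(bin_probs) + np.log(orient_bins / (2 * np.pi))

# The density values no longer sum to 1, but they integrate to 1
bin_width = 2 * np.pi / orient_bins
integral = np.sum(np.exp(log_density) * bin_width)
print(np.isclose(integral, 1.0))  # True
```

So the density values themselves are not probabilities; only after multiplying by the bin width do they sum back to 1.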

harsanyidani commented 2 months ago

> Theoretically these logprobs can be backpropagated. But I don't think information extracted from the pose distribution can help improve the classification.

I understand. But theoretically, would the yaw angle distribution, evaluated on a grid from 0 to 2pi, look roughly like the logprobs here? https://github.com/tjiiv-cprg/EPro-PnP-v2/blob/85215de8002dbd8523ee8eaaf1bae85b47179ebe/EPro-PnP-Det_v2/epropnp_det/models/dense_heads/deform_pnp_head.py#L573

> np.log(orient_bins / (2 * np.pi)) is meant to convert the bin probabilities into a continuous density function on [0, 2pi), such that the integral equals 1. This is mainly for visualization purposes.

I don't understand this. Without np.log(orient_bins / (2 * np.pi)) the probabilities sum to one; with it, they don't.