fredzzhang / pvic

[ICCV'23] Official PyTorch implementation for paper "Exploring Predicate Visual Context in Detecting Human-Object Interactions"
BSD 3-Clause "New" or "Revised" License

hyperparameters #36

Closed 131413ljk closed 1 year ago

131413ljk commented 1 year ago

Hi, great work. We found that the hyperparameters reported in your paper differ slightly from the settings in the code. Which settings should we follow for training?

fredzzhang commented 1 year ago

Hi @131413ljk,

Could you specify which hyper-parameters you are referring to?

Fred.

131413ljk commented 1 year ago

We take the settings in the code: box-score-thresh=0.05, λ=2.8. The hyperparameters reported in the paper are box-score-thresh=0.2, λ=0.26 (geometric mean).

fredzzhang commented 1 year ago

Ahh, the threshold on box scores was indeed changed. I forgot to update that in the paper. So you should stick to the settings in the code.

As for the value of $\lambda$, note that the formula in the paper uses normalised values, that is, the exponents on the box scores and the classification scores sum to one. In the code, however, the exponent on the classification scores is set to 1 and the exponent on the box scores is 2.8. These are unnormalised. After normalisation, you would get $\frac{1}{1+2.8} \approx 0.26$. Because normalising the exponents does not change the ordering of examples, and therefore has no impact on the final performance, we did not normalise the exponents in the code. In short, the value of $\lambda$ in the paper is the normalised exponent on the predicate classification scores, while the hyper-parameter you are setting in the code is the unnormalised exponent on the box scores.
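For anyone who wants to check the ordering argument numerically, here is a minimal sketch. It is not taken from the repository: the function names are hypothetical, and a single number stands in for the box score, since how the human and object box scores are combined is not covered in this thread. The point is that the normalised score equals the unnormalised score raised to the power $\lambda = \frac{1}{1+2.8}$, which is a monotonic transform and so leaves the ranking unchanged.

```python
import random

GAMMA = 2.8                   # unnormalised exponent on the box score (code setting)
LAMBDA = 1.0 / (1.0 + GAMMA)  # normalised exponent on the classification score


def unnormalised_score(box, cls, gamma=GAMMA):
    """Hypothetical helper: exponent gamma on the box score, 1 on the class score."""
    return box ** gamma * cls


def normalised_score(box, cls, lam=LAMBDA):
    """Hypothetical helper: weighted geometric mean, exponents sum to one."""
    return box ** (1.0 - lam) * cls ** lam


# Synthetic (box score, classification score) pairs, purely for illustration.
random.seed(0)
pairs = [(random.uniform(0.05, 1.0), random.uniform(0.01, 1.0)) for _ in range(200)]


def ranking(score_fn):
    """Indices of the pairs sorted from highest to lowest score."""
    return sorted(range(len(pairs)), key=lambda i: score_fn(*pairs[i]), reverse=True)


same_order = ranking(unnormalised_score) == ranking(normalised_score)
print(f"lambda = {LAMBDA:.3f}, same ranking: {same_order}")
```

Since ranking-based metrics depend only on the order of the detections, either form of the exponents should give the same result under these assumptions.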

Fred.

131413ljk commented 1 year ago

Thanks!