facebookresearch / meru

Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023

Making entailment loss learnable? #7

Open ez2rok opened 1 year ago

ez2rok commented 1 year ago

Hello! Great work on this paper!

I was wondering whether you considered making the entailment loss weight learnable, similar to the curvature or the visual / textual alphas? What went into the decision to set it manually?

Cheers!

ez2rok commented 1 year ago

To clarify, I was specifically referring to the entail_weight or λ parameter of the MERU model. However, I see that the authors did experiment with different λ parameters. To quote from the paper:

Some λ > 0 is necessary to induce partial order structure, however, quantitative performance is less sensitive to the choice of λ ∈ [0.01, 0.3]; Higher values of λ strongly regularize against the contrastive loss and hurt performance.

It seems the authors did not make λ / the entail_weight learnable because λ > 0.3 generally hurt performance, while for λ ∈ [0.01, 0.3] the differences were qualitative rather than quantitative, so the weight would be difficult to learn from the loss signal alone.
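For reference, a minimal sketch of what a learnable entail_weight could look like (this is hypothetical and not part of the MERU codebase; the class and parameter names here are my own), parameterizing λ in log-space so it stays positive, in the same spirit as how MERU treats the learnable curvature:

```python
import torch
from torch import nn


class LearnableEntailWeight(nn.Module):
    """Hypothetical sketch: make the entailment loss weight
    (lambda / entail_weight) a learnable parameter, stored in
    log-space so the effective weight is always > 0, since the
    paper notes some lambda > 0 is needed to induce the partial
    order structure."""

    def __init__(self, init_weight: float = 0.2):
        super().__init__()
        # Store log(lambda); exp() recovers a strictly positive weight.
        self.log_entail_weight = nn.Parameter(
            torch.tensor(init_weight).log()
        )

    @property
    def entail_weight(self) -> torch.Tensor:
        return self.log_entail_weight.exp()

    def forward(
        self,
        contrastive_loss: torch.Tensor,
        entailment_loss: torch.Tensor,
    ) -> torch.Tensor:
        # Total loss = contrastive + lambda * entailment.
        return contrastive_loss + self.entail_weight * entailment_loss
```

One caveat with this naive version: because the entailment loss is non-negative, gradient descent on the total loss would tend to push λ toward 0 unless it is regularized or clamped, which is consistent with the insensitivity the authors report and is a plausible reason to keep λ fixed.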