texttron / tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.
http://tevatron.ai
Apache License 2.0

InfoLOOB Loss #42

Closed by raunak-agarwal 1 year ago

raunak-agarwal commented 2 years ago

Hi, thanks for the great work!

Do you think InfoLOOB (formulation here, implementation here) would be a good addition to this library? It seems to outperform InfoNCE in an image-text setting, so I thought it might be worth experimenting with on purely text-based IR tasks.
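
For context, here is a rough sketch of how the two objectives differ, based on my reading of the CLOOB formulation. The notation is an assumption for illustration: a batch of N pairs, similarity s_ij between anchor i and candidate j, and temperature τ. Only one direction is shown; CLOOB sums the loss over both directions and also applies a Hopfield retrieval step that is not shown here.

```latex
\mathcal{L}_{\text{InfoNCE}}
  = -\frac{1}{N}\sum_{i=1}^{N}
    \ln \frac{\exp(s_{ii}/\tau)}{\sum_{j=1}^{N} \exp(s_{ij}/\tau)}
\qquad
\mathcal{L}_{\text{InfoLOOB}}
  = -\frac{1}{N}\sum_{i=1}^{N}
    \ln \frac{\exp(s_{ii}/\tau)}{\sum_{j \neq i} \exp(s_{ij}/\tau)}
```

The only change is that InfoLOOB leaves the positive pair out of the denominator ("leave one out"), so the loss is no longer bounded below by zero.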

luyug commented 2 years ago

Thanks for your suggestion! This looks interesting to me! I think maybe it is time for me and my colleagues to think about incorporating additional forms of loss into Tevatron.

In terms of development, I think we will survey a collection of interesting losses and add them together in a single PR. We are open to suggestions of other loss functions to include.

raunak-agarwal commented 2 years ago

AFAIK, the latent space of CLOOB seems to align the text and image modalities much better than CLIP's. Below are two plots I saw someone post on EleutherAI's Discord, where they created UMAP projections of a small sample of image-text pairs (CLIP on top and CLOOB below).

[Image: UMAP plots, CLIP on top and CLOOB below]

Let me know if integrating this is in the works. It would be a great addition to the library. I can also ping here if I come across other interesting losses.

luyug commented 2 years ago

One question: do you have any expectation of what this loss will do in a text-only setup?

raunak-agarwal commented 2 years ago

My expectation is that in two-tower setups we might see better-aligned embeddings. (I don't think this approach is meant for single-tower setups.)

Other than that, it's hard to say beforehand how much of an improvement we can expect.
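
To make the two-tower, text-only case concrete, here is a minimal sketch that computes both losses over in-batch negatives for toy query and passage embeddings. It is not Tevatron code: the function names (`dot`, `log_sum_exp`, `in_batch_losses`) and the toy vectors are made up for illustration, it uses plain Python instead of PyTorch, and it covers only the query-to-passage direction, without the Hopfield retrieval step that CLOOB adds.

```python
import math


def dot(u, v):
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def in_batch_losses(queries, passages, temperature=0.1):
    """Return (info_nce, info_loob) averaged over the batch.

    queries[i] and passages[i] form the positive pair; every other
    passage in the batch is a negative. Requires at least two pairs.
    """
    n = len(queries)
    nce = loob = 0.0
    for i in range(n):
        scores = [dot(queries[i], p) / temperature for p in passages]
        negatives = scores[:i] + scores[i + 1:]
        # InfoNCE: the denominator includes the positive score.
        nce += log_sum_exp(scores) - scores[i]
        # InfoLOOB: the denominator leaves the positive score out.
        loob += log_sum_exp(negatives) - scores[i]
    return nce / n, loob / n


# Toy batch: each query is close to its own passage.
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
toy_passages = [[0.9, 0.1], [0.2, 0.8]]
nce_loss, loob_loss = in_batch_losses(toy_queries, toy_passages)
print(f"InfoNCE: {nce_loss:.4f}  InfoLOOB: {loob_loss:.4f}")
```

Because the positive is dropped from the denominator, InfoLOOB is never larger than InfoNCE on the same batch and can go negative. Whether that changes retrieval quality on text is the open question in this thread.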

luyug commented 2 years ago

I see. We will triage this through the weekend.