HazyResearch / bootleg

Self-Supervision for Named Entity Disambiguation at the Tail
http://hazyresearch.stanford.edu/bootleg
Apache License 2.0
212 stars 27 forks

Consider to Benchmark Bootleg? #102

Closed hitercs closed 2 years ago

hitercs commented 2 years ago

Hi,

Thanks for your great work! I saw that Bootleg has changed the architecture to be a bi-encoder. I am curious about that:

1) How does the latest Bootleg model perform in comparison with the initial Bootleg model?
2) How does Bootleg perform on popular benchmark datasets (e.g., the datasets used in GENRE)?

Are you planning to add the official evaluation results of Bootleg on these benchmarks?

Thanks!

lorr1 commented 2 years ago

Hello,

We did change the architecture and training. This model is still in beta at the moment, but preliminary results on AmbER show it performs comparably to the old model, especially over the tail. We also find this model performs much better on zero-shot downstream tasks that rely on nearest neighbors of the entity embeddings - it's one of the main reasons we changed the architecture.
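To make the nearest-neighbor use case concrete, here is a minimal sketch of looking up the closest entities to a query embedding by cosine similarity. The entity names and vectors below are made up for illustration; in practice the table would come from the entity encoder of a bi-encoder model, and a library like FAISS would be used at scale.

```python
import numpy as np

# Hypothetical entity embedding table: 4 entities x 3-dim vectors.
# Real embeddings would be produced by the model's entity encoder.
entity_names = [
    "Lincoln (city)",
    "Abraham Lincoln",
    "Lincoln (film)",
    "Lincoln Motor Co.",
]
entity_embs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.2, 0.6, 0.6],
    [0.7, 0.0, 0.5],
])

def nearest_entities(query_emb, embs, k=2):
    """Return indices of the k entities with highest cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    e = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = e @ q                      # cosine similarity to each entity
    return np.argsort(-sims)[:k]      # indices sorted by descending similarity

# A query (mention) embedding that lies near "Abraham Lincoln".
query = np.array([0.15, 0.85, 0.2])
top = nearest_entities(query, entity_embs)
print([entity_names[i] for i in top])
```

The appeal for zero-shot tasks is that this lookup needs no task-specific training: any mention encoded into the same space can be matched against the full entity table.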

I don't have plans at the moment to run a full evaluation suite but am happy to provide support if it helps.

Thanks!

hitercs commented 2 years ago

Thanks!