openreview / openreview-expertise

Expertise modeling for the OpenReview matching system
MIT License

Use PyTorch for New Model Embeddings #181

Open haroldrubio opened 6 months ago

haroldrubio commented 6 months ago

This PR moves more shared functions into the Predictor class, avoids moving the embeddings from GPU to CPU on every batch (which speeds up each iteration), and uses PyTorch's torch.save() and torch.load() to store embeddings more efficiently (they take up less disk space).
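For illustration, here is a minimal sketch of the two ideas described above: keeping per-batch embeddings on the device and copying to CPU once at the end, then storing them with PyTorch serialization. The function names, the payload layout, and the toy `torch.nn.Linear` encoder are all hypothetical and are not taken from this PR's diff or from the Predictor class.

```python
import os
import tempfile

import torch


def embed_in_batches(encode_fn, batches, device):
    """Encode every batch on `device`, copying to CPU only once at the end."""
    chunks = []
    with torch.no_grad():
        for batch in batches:
            # The output stays on the device; no per-batch .cpu() call.
            chunks.append(encode_fn(batch.to(device)))
    # One device-to-host transfer for the whole run.
    return torch.cat(chunks, dim=0).cpu()


def save_embeddings(path, paper_ids, embeddings):
    """Store ids and embeddings together in one binary file."""
    # Hypothetical payload layout, chosen for this sketch only.
    torch.save({"ids": list(paper_ids), "embeddings": embeddings}, path)


def load_embeddings(path):
    """Load the payload back onto the CPU."""
    payload = torch.load(path, map_location="cpu")
    return payload["ids"], payload["embeddings"]


# Toy stand-in for a real encoder (assumption: 8-dim inputs, 4-dim embeddings).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
encoder = torch.nn.Linear(8, 4).to(device)
batches = [torch.randn(3, 8), torch.randn(2, 8)]
paper_ids = ["p1", "p2", "p3", "p4", "p5"]

embeddings = embed_in_batches(encoder, batches, device)

with tempfile.TemporaryDirectory() as tmp_dir:
    emb_path = os.path.join(tmp_dir, "embeddings.pt")
    save_embeddings(emb_path, paper_ids, embeddings)
    loaded_ids, loaded_embeddings = load_embeddings(emb_path)
```

Saving one tensor plus an id list, rather than one JSON record per paper, is what keeps the file compact in this sketch; the actual storage format used in the PR may differ.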

melisabok commented 6 months ago

It seems this only works for specter2+scincl. Can we support it for specter2+mfr too?

haroldrubio commented 6 months ago

It looks like MFR is already using PyTorch serialization. The original SPECTER would be pretty tough to override, since it appears to re-use some old code from a 4-year-old branch of the allennlp library, and the JSON predictions are built into it.

carlosmondra commented 2 weeks ago

Do we still want to merge this?