sacdallago / bio_embeddings

Get protein embeddings from protein sequences
http://docs.bioembeddings.com
MIT License
463 stars, 65 forks

ESM updates and versioning #103

Closed: konstin closed this issue 3 years ago

konstin commented 3 years ago

ESM has published a new model, ESM-1b. We can't simply replace the old model with the new one, because that would change the protocol's output. However, we need some way to let the user "upgrade" to newer models/weights in such cases. Two apparent solutions are a new protocol esm_1b that shares its implementation with esm, or a field model: esm-1b that lets the user select the model weights. The drawback of the first is that we might end up with an unwieldy number of embed protocols and classes, while the second might break if, for example, the new model changes the embedding dimensionality.
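
To make the trade-off concrete, here is a minimal sketch of both options side by side. Everything in it is hypothetical: the class names, the two lookup tables, the helper functions, and the dimension values (8 and 16 are placeholders, not the real model sizes) are assumptions for illustration and are not taken from the bio_embeddings code base.

```python
from typing import Dict, Optional, Type


class EsmBase:
    """Shared implementation; subclasses only pin the weights and output size."""

    weights: str = ""
    dimension: int = 0  # placeholder values below, not the real model sizes

    def describe(self) -> str:
        return f"{self.weights}:{self.dimension}"


class Esm(EsmBase):
    weights = "esm"
    dimension = 8  # placeholder


class Esm1b(EsmBase):
    weights = "esm-1b"
    dimension = 16  # placeholder


# Option 1 (hypothetical): one protocol name per model, so each protocol
# keeps a fixed output shape, at the cost of one entry and class per model.
PROTOCOLS: Dict[str, Type[EsmBase]] = {"esm": Esm, "esm_1b": Esm1b}

# Option 2 (hypothetical): a single protocol plus a `model` field that
# selects the weights; the output shape now depends on the field's value.
MODELS: Dict[str, Type[EsmBase]] = {"esm": Esm, "esm-1b": Esm1b}


def from_protocol(config: dict) -> EsmBase:
    """Option 1: look the embedder up by protocol name alone."""
    return PROTOCOLS[config["protocol"]]()


def from_model_field(
    config: dict, expected_dimension: Optional[int] = None
) -> EsmBase:
    """Option 2: look the embedder up by the `model` field (default: esm).

    If a consumer states the dimension it expects, fail early instead of
    silently producing embeddings of a different size.
    """
    embedder = MODELS[config.get("model", "esm")]()
    if expected_dimension is not None and embedder.dimension != expected_dimension:
        raise ValueError(
            f"model {embedder.weights!r} yields {embedder.dimension}-d embeddings, "
            f"expected {expected_dimension}-d"
        )
    return embedder
```

With the first option, a config of {"protocol": "esm_1b"} always resolves to the same class and output size. With the second, {"protocol": "esm", "model": "esm-1b"} reuses the existing protocol name, and the explicit dimension check is one possible way to surface the breakage described above.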

sacdallago commented 3 years ago

I would opt for a new protocol. Consistency is not always intelligent 😞

Anyway, yes. That model is real cool. Would be super cool to have it in!