facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

Mapping between MGYP ID and Uniprot ID #357

Closed blackjack6666 closed 2 years ago

blackjack6666 commented 2 years ago

Hello,

First of all, thanks for the fantastic work! Is there any mapping between MGYP IDs and UniProt IDs? I want to download some structures from the ESM Atlas by UniProt ID, but it seems I can only search for structures by MGYP ID.

Thank you!

ebetica commented 2 years ago

There's no such mapping afaik. MGnify and UniProt are two completely different sequence databases.

tomsercu commented 2 years ago

Just note that the best you can probably do is a sequence search, using either one of the databases as the source and the other as the target, if an approximate nearest neighbor is enough for you. You could look at mmseqs search --max-accept 3
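
For anyone landing here later, a rough sketch of what that search could look like when driven from Python. This is not an official ESM or MGnify workflow. It assumes MMseqs2 is installed as `mmseqs`, that you have already downloaded FASTA files for both sides, and that the results are in the default tab-separated `convertalis` layout (query, target, identity, ..., bit score in the last of 12 columns). The file names, the work directory, and the helper names `build_mmseqs_commands`, `run_search` and `best_hits` are all made up for illustration.

```python
import os
import subprocess


def build_mmseqs_commands(query_fasta, target_fasta, workdir="mmseqs_work", max_accept=3):
    """Return the MMseqs2 command lines for a query-vs-target search.

    Paths inside workdir are hypothetical names chosen for this sketch.
    """
    q_db = f"{workdir}/queryDB"
    t_db = f"{workdir}/targetDB"
    res_db = f"{workdir}/resultDB"
    tmp = f"{workdir}/tmp"
    m8 = f"{workdir}/result.m8"
    return [
        # Convert both FASTA files into MMseqs2 databases.
        ["mmseqs", "createdb", query_fasta, q_db],
        ["mmseqs", "createdb", target_fasta, t_db],
        # Stop aligning a query once max_accept alignments have been accepted.
        ["mmseqs", "search", q_db, t_db, res_db, tmp, "--max-accept", str(max_accept)],
        # Write the hits as a tab-separated table.
        ["mmseqs", "convertalis", q_db, t_db, res_db, m8],
    ]


def run_search(query_fasta, target_fasta, workdir="mmseqs_work", max_accept=3):
    """Run the commands in order; raises if any MMseqs2 step fails."""
    os.makedirs(workdir, exist_ok=True)
    for cmd in build_mmseqs_commands(query_fasta, target_fasta, workdir, max_accept):
        subprocess.run(cmd, check=True)
    return f"{workdir}/result.m8"


def best_hits(m8_lines):
    """Keep the highest bit-score target per query from tab-separated hit lines."""
    best = {}
    for line in m8_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, target = fields[0], fields[1]
        identity, bits = float(fields[2]), float(fields[11])
        if query not in best or bits > best[query][2]:
            best[query] = (target, identity, bits)
    return best
```

You would call `run_search("uniprot_queries.fasta", "mgnify_targets.fasta")` and then pass the lines of the resulting file to `best_hits`. Keep in mind that the result is only a nearest neighbor by sequence similarity, not a true ID mapping, so it is worth checking the identity column before trusting a match.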