Open xnought opened 1 month ago
I made an interface too
https://github.com/user-attachments/assets/2609b63f-66fe-4643-8220-d5fc0459e487
If you can help me with the text part I mentioned above, I can add natural language queries to the website.
Update: I deployed it at https://ocular.cc.gatech.edu/DS569k/ in case anyone wants to use it.
I also made a Nomic 2D map of 250k proteins using ProteinCLIP embeddings, topic-modeled by function, if you're interested: https://atlas.nomic.ai/data/donnybertucci/swissprot-proteinclip/map
Which version of proteinclip did you use for this?
The smallest one: the 6-layer ESM2 model.
If you want the data, you can also download it here: https://huggingface.co/datasets/donnyb/DS569k. I precomputed all the embeddings and other metadata so I can reuse them later.
Thanks! Really fun to just look around the different regions of the 2d map
From the paper, I see that you first embed text with text-embedding-3-large and then pass it through the projection network you trained with contrastive learning.
Can you also release the pretrained text projection models?
I want to embed text in the joint embedding space and find similar proteins that way.
Any help would be appreciated! Thank you very much. This is a super cool project!
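For reference, the lookup I have in mind would look roughly like the sketch below. Everything here is an assumption on my side: `project_text` is a hypothetical stand-in for the text projection network (which is not released yet), the toy vectors stand in for text-embedding-3-large output and precomputed protein embeddings, and cosine similarity is my guess at a reasonable ranking metric for the joint space.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_top_k(
    query: Vector, protein_embeddings: Dict[str, Vector], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k protein IDs whose embeddings are most similar to the query."""
    q = l2_normalize(query)
    scored = []
    for protein_id, emb in protein_embeddings.items():
        e = l2_normalize(emb)
        scored.append((protein_id, sum(a * b for a, b in zip(q, e))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def search_by_text(
    text_embedding: Vector,
    project_text: Callable[[Vector], Vector],
    protein_embeddings: Dict[str, Vector],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Project a text embedding into the joint space, then rank proteins."""
    return cosine_top_k(project_text(text_embedding), protein_embeddings, k)


def toy_projection(v: Vector) -> List[float]:
    # Hypothetical placeholder for the trained projection network:
    # it just keeps the first two dimensions.
    return list(v[:2])


# Toy data only; real inputs would be the precomputed protein embeddings.
toy_proteins = {"P_A": [1.0, 0.0], "P_B": [0.0, 1.0], "P_C": [0.7, 0.7]}
results = search_by_text([0.9, 0.1, 0.5], toy_projection, toy_proteins, k=2)
print(results)
```

With the real projection weights, `toy_projection` would be swapped for the released model and `toy_proteins` for the precomputed protein embeddings.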