XieResearchGroup / DISAE

MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization

Question about how can i generate vector representations of proteins #10

Open xfy9 opened 2 years ago

xfy9 commented 2 years ago

Hi, I have a CSV file that stores protein sequences. How can I generate vector representations of these proteins? I would be very grateful for your help.

lxie21 commented 2 years ago

Hi,

You need to write a program following the steps below:

  1. Generate a multiple sequence alignment for your sequence of interest.
  2. Compute the conservation score for each position.
  3. Extract the ~200 most conserved positions.
  4. Generate triplets for the extracted positions.

The detailed procedure is in the published DISAE paper.

Best, Lei
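For readers who want a starting point, steps 2 to 4 above could be sketched roughly as follows. This is a minimal illustration and not the DISAE authors' code. It assumes the alignment from step 1 has already been produced by an external tool and loaded as a list of equal-length strings with `-` for gaps. The conservation score used here (one minus normalized Shannon entropy, down-weighted by the fraction of gaps in the column) and the non-overlapping grouping into triplets are assumptions made for the example; the paper defines the exact scoring and triplet scheme. The function names are hypothetical.

```python
import math
from collections import Counter


def conservation_scores(msa):
    """Score every alignment column in [0, 1]; higher means more conserved.

    Assumed metric (not taken from the DISAE paper): 1 - normalized Shannon
    entropy of the non-gap residues, multiplied by the non-gap fraction so
    that mostly-gapped columns are not ranked as conserved.
    """
    n_seqs = len(msa)
    n_cols = len(msa[0])
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(n_cols):
        residues = [seq[col] for seq in msa if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # column contains only gaps
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append((1.0 - entropy / max_entropy) * (total / n_seqs))
    return scores


def top_conserved_positions(scores, k=200):
    """Return the indices of the k highest-scoring columns, in sequence order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def to_triplets(aligned_query, positions):
    """Read the query's residues at the selected columns and group them
    into non-overlapping triplets; a trailing remainder of 1-2 residues
    is dropped. Gap characters in the query are kept as '-'.
    """
    residues = [aligned_query[i] for i in positions]
    return ["".join(residues[i:i + 3]) for i in range(0, len(residues) - 2, 3)]


# Toy alignment; the first row is treated as the sequence of interest.
msa = ["MKTAYIAK", "MKSAYLAK", "MRTAY-AK", "MKTGYIAR"]
scores = conservation_scores(msa)
positions = top_conserved_positions(scores, k=6)  # use k=200 for real alignments
triplets = to_triplets(msa[0], positions)
print(positions, triplets)
```

The resulting triplets are the tokens that would then be fed to the pretrained model to obtain a vector per protein. For a CSV of sequences, the same procedure would be repeated for each row, with one alignment per sequence.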
