XieResearchGroup / DISAE

MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization

Question about create_tfrecords.sh #12

Open xfy9 opened 2 years ago

xfy9 commented 2 years ago

Hi, sorry for asking so many questions. I have created the singlet_representatives, singlets, triplet_representatives, and triplets directories. When I ran create_tfrecords.sh, I found that I do not have a triplets_wo_IDs directory. What kind of files does triplets_wo_IDs store, and what do they contain? How do I generate the directory and its contents? I also noticed that some hmp_cluster_triplets files appear garbled. Is that normal? I would be very grateful for an answer.

lxie21 commented 2 years ago

I am sorry, but the data pre-processing code is not fully functional. I think it is better for you to write the script yourself, following the procedure described in the paper.

Best, Lei


thomasly commented 2 years ago

Triplets w/o IDs are simply the triplet clusters in your triplets folder without the ID column. It should be easy to write a script to generate them. For the garbled files, could you specify what kind of garbling you get? If it is '999' or 'jjj', that is completely expected. These characters are used to represent gaps.
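
For anyone landing here, a script along those lines might look like the sketch below. It is not the repository's own pre-processing code. It assumes each cluster file is plain text with one record per line, fields separated by tabs, and the ID in the first column; none of that is confirmed in this thread, so adjust `sep` and `id_col` to match your files. The function names are made up for illustration.

```python
from pathlib import Path


def strip_id_column(line: str, sep: str = "\t", id_col: int = 0) -> str:
    """Return one delimited line with the field at id_col removed.

    Assumption: the ID is a single field, by default the first one.
    """
    fields = line.rstrip("\n").split(sep)
    kept = [f for i, f in enumerate(fields) if i != id_col]
    return sep.join(kept)


def strip_ids_from_dir(src_dir: str, dst_dir: str,
                       sep: str = "\t", id_col: int = 0) -> int:
    """Copy every file in src_dir to dst_dir without its ID column.

    Returns the number of files written. Blank lines are skipped.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = 0
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        with path.open() as fin, (dst / path.name).open("w") as fout:
            for line in fin:
                if line.strip():
                    fout.write(strip_id_column(line, sep, id_col) + "\n")
        written += 1
    return written
```

With those assumptions, calling `strip_ids_from_dir("triplets", "triplets_wo_IDs")` would create the missing directory next to the existing triplets folder. Gap placeholders such as '999' or 'jjj' are left untouched, since they are part of the data.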