XieResearchGroup / DISAE

MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization

Some problem about CODE-AE #13

Open tb1over opened 1 year ago

tb1over commented 1 year ago

I am sorry to contact your team through this repository, but the issue tracker of CODE-AE (https://github.com/XieResearchGroup/CODE-AE) appears to be closed. I have some questions about CODE-AE and would like to consult your team.

  1. Does the pretraining stage of CODE-AE use only the cell-line gene expression data, and not the patient gene expression data?
    In pretrain_hyper_main.py, the ccle_only=True parameter is used to get the cell-line data. Maybe I have misunderstood it.
  2. The intermediate_results directory contains some data files that appear to be results. How were these results produced: by private code of yours, or by CODE-AE itself?

Thank you very much. Looking forward to your reply!

lxie21 commented 1 year ago

Dear Sir or Madam,

Thank you for your interest in CODE-AE.

For the first question: the training of CODE-AE has two stages. The first stage is unsupervised learning; it should include unlabeled data from both the source domain (CCLE) and the target domain (TCGA). The second stage is supervised training, in which only labeled data from the source domain (CCLE) is used.
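The two stages described above can be sketched as a data-selection step. This is only an illustration under stated assumptions: the `Sample` class, the two helper functions, and the toy values are hypothetical and do not come from the CODE-AE repository.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    domain: str                    # "CCLE" (source) or "TCGA" (target)
    expression: List[float]        # gene expression profile (toy values)
    label: Optional[float] = None  # drug response; None when unlabeled


def stage1_unsupervised(samples: List[Sample]) -> List[List[float]]:
    """Stage 1: expression profiles from both domains, labels ignored."""
    return [s.expression for s in samples if s.domain in ("CCLE", "TCGA")]


def stage2_supervised(samples: List[Sample]) -> List[Tuple[List[float], float]]:
    """Stage 2: only labeled samples from the source domain (CCLE)."""
    return [
        (s.expression, s.label)
        for s in samples
        if s.domain == "CCLE" and s.label is not None
    ]


# Toy data: one labeled CCLE sample, one unlabeled CCLE sample, one TCGA sample.
samples = [
    Sample("CCLE", [0.1, 0.2], label=0.7),
    Sample("CCLE", [0.3, 0.4]),
    Sample("TCGA", [0.5, 0.6]),
]
pretrain_inputs = stage1_unsupervised(samples)  # all three profiles
finetune_pairs = stage2_supervised(samples)     # only the labeled CCLE sample
```

In this sketch the TCGA sample contributes to the first stage only, which matches the description of the target domain as unlabeled.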

Best, Lei


lxie21 commented 1 year ago
  1. The encoded_features folder (https://github.com/XieResearchGroup/CODE-AE/tree/main/intermediate_results/encoded_features) can be produced with https://github.com/XieResearchGroup/CODE-AE/blob/main/code/generate_encoded_features.py.
  2. The plot_data folder (https://github.com/XieResearchGroup/CODE-AE/tree/main/intermediate_results/plot_data) holds the baseline comparison results for both the PDTC and TCGA tasks; users need to run the baseline models themselves to obtain those results. The CODE-AE results come from the main script: pass --pdtc as True or False to switch between the two tasks.
  3. Some plots can be generated from this notebook: https://github.com/XieResearchGroup/CODE-AE/blob/main/code/generate_plots.ipynb

Best, Lei


tb1over commented 1 year ago

Thank you very much for your detailed answer.

tb1over commented 1 year ago

Another question, about the pretraining stage of the encoder. Does the parameter ccle_only = True in the function data.get_unlabeled_dataloaders(), as used in pretrain_hyper_main.py and drug_ft_hyper_main.py, mean that only the CCLE data is used? According to your paper and your answer, pretraining should include unlabeled data from both CCLE (source domain) and tissue (target domain), so this confuses me a little. Thank you again.

lxie21 commented 1 year ago

When including another data set, set ccle_only = False.
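The effect of the flag can be illustrated with a small stand-in. This is a hedged sketch, not the repository's code: `select_unlabeled_domains` is a hypothetical helper, and it assumes the other data set is the TCGA target domain mentioned earlier in the thread. The real data.get_unlabeled_dataloaders() may take other arguments and return something different.

```python
from typing import List


def select_unlabeled_domains(ccle_only: bool) -> List[str]:
    """Hypothetical stand-in; not the repository's get_unlabeled_dataloaders()."""
    domains = ["CCLE"]          # the source domain is always included
    if not ccle_only:
        domains.append("TCGA")  # assumption: the other data set is TCGA
    return domains


cell_lines_only = select_unlabeled_domains(ccle_only=True)
both_domains = select_unlabeled_domains(ccle_only=False)
```

Under these assumptions, ccle_only = True restricts the unlabeled data to cell lines, and ccle_only = False adds the second data set.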

Best, Lei
