phbradley / alphafold_finetune

Python code for fine-tuning AlphaFold to perform protein-peptide binding predictions
Apache License 2.0

fine tuning peptide-MHC full model #8

Open croshong opened 1 year ago

croshong commented 1 year ago

Hi

I'm trying to reproduce the fine-tuned peptide-MHC model.

The example in the README shows the following command line for fine-tuning the full peptide-MHC model:

python run_finetuning.py \
    --data_dir $ALPHAFOLD_DATA_DIR \
    --binder_intercepts 0.80367635 --binder_intercepts 0.43373787 \
    --freeze_binder \
    --train_dataset datasets_alphafold_finetune/pmhc_finetune/combo_1and2_train.tsv \
    --valid_dataset datasets_alphafold_finetune/pmhc_finetune/combo_1and2_valid.tsv

I'm wondering whether the above is only an example of the script's usage, or whether I can use the model it produces for real production purposes.

If the above command line is just an example, which parameters should I change for production use?

Thanks

phbradley commented 1 year ago

Hi there, That looks like the production command line to me. For our parameter set, we stopped training after 2 epochs (2 times through all the training examples). Let me know if you run into any trouble. Take care, Phil



croshong commented 1 year ago

Thanks for your reply.

I have successfully set up your code, the AlphaFold library, and several other libraries from DeepMind.

I'm now running the fine-tuning command above, but it is taking a very long time. I started it 5 days ago and it is still running on my server.

My server has a GTX 1080 GPU, which has a compute capability of around 6.

What kind of GPU and server are you using for fine-tuning?

Also, can you share the fine-tuned model generated with the above command, the one used for the PNAS publication?

Thanks

phbradley commented 1 year ago

Hi there, Can you tell how far the command has gotten, for example by looking at some of the log messages? It will run for 10 epochs by default, but we took the model after only 2 epochs.

As we noted in the README, the fine-tuned parameters are included in this download:

https://files.ipd.uw.edu/pub/alphafold_finetune_motmaen_pnas_2023/datasets_alphafold_finetune_v2_2023-02-20.tgz

Let me know if you run into any trouble. Take care, Phil
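To gauge how far a run has gotten, one rough approach is to extrapolate linearly from the progress visible in the log. The following is a minimal sketch, assuming you can read off how many training examples have been processed and that you know the size of the training set. The function name and the numbers in the usage note are hypothetical and do not come from the repository or its logs.

```python
def estimate_hours_remaining(elapsed_hours, examples_done, examples_per_epoch,
                             target_epochs=2):
    """Linearly extrapolate the wall-clock hours left to finish target_epochs.

    Assumes every training example takes about the same time, which is only
    a rough approximation.
    """
    if examples_done <= 0:
        raise ValueError("need at least one processed example to extrapolate")
    total_examples = examples_per_epoch * target_epochs
    # Never report negative time if the run is already past the target.
    remaining_examples = max(total_examples - examples_done, 0)
    return elapsed_hours * remaining_examples / examples_done
```

For example, `estimate_hours_remaining(120.0, 1500, 6000)` extrapolates from a made-up run that processed 1500 of 6000 examples per epoch in 120 hours. The default `target_epochs=2` reflects the two epochs mentioned above; pass 10 to estimate the full default run.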



croshong commented 1 year ago

The attached log file is from the fine-tuning run, which is still in progress. It is still in training epoch 0.

I found the fine-tuned parameters you mentioned. But for model_2_ptm, should I look in the original AlphaFold package?

croshong commented 1 year ago

I forgot the attachment: alphafold_log

croshong commented 1 year ago

I think model_2_ptm refers to the file params/params_model_2_ptm.npz. Then what is model_2_ptm_ft? Is it supposed to be produced by fine-tuning?

croshong commented 1 year ago

I was able to run the prediction with the fine-tuned parameters using the following command line:

python run_prediction.py --targets examples/pmhc_hcv_polg_10mers/targets.tsv \
    --outfile_prefix polg_test2 --model_names model_2_ptm_ft \
    --model_params_files datasets_alphafold_finetune/params/mixed_mhc_pae_run6_af_mhc_params_20640.pkl \
    --ignore_identities

I got the polg_test1_final.tsv.xlsx table, as in the attached Excel file. From this table, how can I get the binding score for each peptide and MHC?
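The thread does not say which column of the output table holds the binding score, so the following is only a generic sketch for inspecting such a table. It assumes the plain tab-separated `_final.tsv` file (not the `.xlsx` copy) and sorts its rows by whichever numeric column you choose. The function name is hypothetical, and the column name is something you would need to confirm against the README or the file header.

```python
import csv


def rank_rows_by_column(tsv_path, score_column, ascending=True):
    """Read a tab-separated file and return its rows sorted by one numeric column."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or score_column not in reader.fieldnames:
            raise KeyError(
                f"column {score_column!r} not found; available: {reader.fieldnames}"
            )
        rows = []
        for row in reader:
            try:
                row[score_column] = float(row[score_column])
            except (TypeError, ValueError):
                continue  # skip rows with a missing or non-numeric value
            rows.append(row)
    return sorted(rows, key=lambda r: r[score_column], reverse=not ascending)
```

Calling `rank_rows_by_column(path, column)` raises a `KeyError` that lists the available columns if the name is wrong, which is a quick way to see what the table contains. Whether a lower or higher value indicates stronger predicted binding depends on the column, so set `ascending` accordingly.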