YoshitakaMo / localcolabfold

ColabFold on your local PC
MIT License

colab_batch with local colab search multimer conformation #75

Closed drewaight closed 2 years ago

drewaight commented 2 years ago

Hello, I am trying to run colabfold_batch with amber relax and AlphaFold-multimer to generate antibody structures.

When I run regular ColabFold (through the notebook), the correct structure is generated (the vL and vH are paired as a heterodimer). However, when I run it locally I get an incorrect structure (vH and vL separated but connected, not paired as a multimer). Attachment: colabfold_structures.zip

I have compiled and installed mmseqs on my HPC and set up the databases according to the instructions at https://colabfold.mmseqs.com/.

I generate the .a3m files with the provided shell script, using the command

./colabfold_search.sh mmseqs "trastuzumab.fasta" "database/" "result_msa_dir/" "uniref30_2103_db" "" "colabfold_envdb_202108_db" "1" "0" "1" "80"

Where the trastuzumab.fasta input has the format

>trastuzumab
DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSRSGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK:EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS
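As a quick sanity check on the input, here is a minimal Python sketch that reads a single-record FASTA and splits the sequence into chains. It assumes the chains of one complex are joined with ':' on the sequence line, as in the trastuzumab example; the function names are hypothetical and not part of ColabFold.

```python
# Sketch: split a ':'-joined complex sequence from a one-record FASTA into chains.
# Assumption: chains are separated by ':' as in the example above.
# read_single_fasta and split_chains are hypothetical helper names.


def read_single_fasta(text):
    """Return (record_id, sequence) for FASTA text holding exactly one record."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a header line starting with '>'")
    header_fields = lines[0][1:].split()
    record_id = header_fields[0] if header_fields else ""
    # Sequence lines may be wrapped, so join everything after the header.
    sequence = "".join(lines[1:])
    return record_id, sequence


def split_chains(sequence):
    """Split a ':'-joined sequence into a list of chain sequences."""
    chains = sequence.split(":")
    if any(not chain for chain in chains):
        raise ValueError("empty chain found; check for stray ':' characters")
    return chains


if __name__ == "__main__":
    example = ">demo\nACDEFG:HIKLMN\n"  # short placeholder, not a real antibody
    record_id, sequence = read_single_fasta(example)
    for index, chain in enumerate(split_chains(sequence), start=1):
        print(record_id, "chain", index, "length", len(chain))
```

If this reports a single chain for an input that should be a heterodimer, the ':' separator was probably lost somewhere before the search step.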

I have installed localcolabfold on my HPC (with CUDA 11.2.2 on my GPU cluster) and run

colabfold_batch --amber --templates result_msa_dir predictions

and the output is incorrect, as described above. I have attached both the correct and the incorrect output structures. I cannot work out what I am doing wrong in my local run that prevents the correct multimer from being generated. Thanks so much for any help or advice on fixing this problem!

Drew

nspyf commented 2 years ago

For a multimer, you should use AlphaFold2-multimer.

For the input file, please try this CSV format (as far as I know, CSV input works):

id,sequence
trastuzumab,DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSRSGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK:EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS
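If you would rather generate this file than write it by hand, here is a minimal Python sketch. It assumes only the two-column id,sequence layout shown above, with chains joined by ':'; write_colabfold_csv is a hypothetical helper name, not a ColabFold function.

```python
# Sketch: write an id,sequence CSV with ':'-joined chains, as in the example above.
# write_colabfold_csv is a hypothetical helper, not part of ColabFold.
import csv
import io


def write_colabfold_csv(records, handle):
    """Write (record_id, [chain, ...]) pairs as id,sequence rows."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["id", "sequence"])
    for record_id, chains in records:
        # One row per complex; its chains are joined with ':'.
        writer.writerow([record_id, ":".join(chains)])


if __name__ == "__main__":
    buffer = io.StringIO()
    # Short placeholder chains, not real antibody sequences.
    write_colabfold_csv([("demo", ["ACDEFG", "HIKLMN"])], buffer)
    print(buffer.getvalue(), end="")
```

To write to disk, pass a file opened with open("inputfile.csv", "w", newline="") instead of the StringIO buffer.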

Then run it like this:

colabfold_batch --amber --templates --num-recycle 3 --model-type AlphaFold2-multimer inputfile.csv outputdir/
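If you launch this from a job script, a small Python wrapper can make the model type explicit so it is not left out by accident. This is only a sketch: it uses just the flags from the command above, and build_colabfold_command and run_colabfold are hypothetical names.

```python
# Sketch: assemble the colabfold_batch call shown above as an argument list.
# Only flags that appear in the command above are used; the helper names
# are hypothetical.
import shlex
import subprocess


def build_colabfold_command(input_path, output_dir, num_recycle=3,
                            model_type="AlphaFold2-multimer"):
    """Return the colabfold_batch command as a list of arguments."""
    return [
        "colabfold_batch", "--amber", "--templates",
        "--num-recycle", str(num_recycle),
        "--model-type", model_type,
        input_path, output_dir,
    ]


def run_colabfold(command):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    command = build_colabfold_command("inputfile.csv", "outputdir/")
    # Print the command for inspection; call run_colabfold(command) to execute
    # it (requires colabfold_batch on PATH).
    print(shlex.join(command))
```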

(By the way, did you forget '--host-url'?)