google-deepmind / alphafold

Open source code for AlphaFold.
Apache License 2.0

Running without MSA features #636

Open Gabriel-Ducrocq opened 1 year ago

Gabriel-Ducrocq commented 1 year ago

Hello,

I modified AlphaFold so that it takes my own custom template features. I would like to modify it again so that it runs without MSA features. Is there a convenient way to do this? Is setting all the MSA features to 0 equivalent to having no MSA features?

Thank you, Gabriel.

cclough commented 1 year ago

@Gabriel-Ducrocq did you have any luck doing this? I'm also interested in running AF2 without the MSA step

smturzo commented 1 year ago

If you haven't figured this out already, you can try the ColabFold Google Colab notebook (Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: Making protein folding accessible to all. Nature Methods, 2022, https://www.nature.com/articles/s41592-022-01488-1):

https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb#scrollTo=C2_sh2uAonJH

In the notebook linked above, switch msa_mode to single_sequence; that should work. If it does, great. There are also other (slightly more complex) ways to hack AF2 to run without an MSA. Let me know if you are interested in those.

busrasavas commented 1 year ago

Hi @smturzo, I'm also trying to run AF2 without an MSA locally. I'd appreciate it if you could explain the other (slightly more complex) ways to hack it; I'm very interested in that! Thanks in advance.

smturzo commented 1 year ago

According to my colleague, you first need to run a plain AF2 prediction. Then go to sys_name/sys_name/msas/ for a monomeric system (sys_name/sys_name/msas/A/ for multimers) and edit the MSA files:

1. In bfd_uniclust_hits.a3m, delete everything and add a single FASTA-format sequence of your target protein.
2. Change mgnify_hits.sto so that it contains only your protein sequence, with its corresponding #=GC RF reference line on the next line.

Then re-run the prediction with the "--use_precomputed_msas" flag. Keep in mind that if AF2 detects that the .sto file format is not correct, it will ignore the precomputed-MSA flag and regenerate the MSAs. It is therefore important to keep the .sto file format the same as before.
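For anyone who would rather script the file edits than do them by hand, here is a minimal Python sketch of the two steps. The function name `write_single_sequence_msas`, the record name "query", and the use of "x" characters for the #=GC RF line are my assumptions, not something taken from AlphaFold itself; compare the output against the .sto file from your own plain run before relying on it.

```python
from pathlib import Path


def write_single_sequence_msas(msa_dir, sequence, name="query"):
    """Write single-sequence versions of the two MSA files named above.

    Hypothetical helper: `name` and the 'x'-filled RF line are assumptions;
    check them against the files produced by a normal AF2 run.
    """
    msa_dir = Path(msa_dir)
    msa_dir.mkdir(parents=True, exist_ok=True)

    # Strip whitespace/newlines so the sequence sits on a single line.
    seq = "".join(sequence.split()).upper()

    # A3M with a single FASTA record: header line, then the sequence.
    a3m_path = msa_dir / "bfd_uniclust_hits.a3m"
    a3m_path.write_text(f">{name}\n{seq}\n")

    # Stockholm file: header, one sequence line, the #=GC RF line, terminator.
    rf_label = "#=GC RF"
    width = max(len(name), len(rf_label))
    sto_path = msa_dir / "mgnify_hits.sto"
    sto_path.write_text(
        "# STOCKHOLM 1.0\n\n"
        f"{name.ljust(width)} {seq}\n"
        f"{rf_label.ljust(width)} {'x' * len(seq)}\n"
        "//\n"
    )
    return a3m_path, sto_path
```

You would call it as `write_single_sequence_msas("sys_name/sys_name/msas", your_sequence)` and then re-run with "--use_precomputed_msas". Any other MSA files in that directory are not touched by this sketch.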

busrasavas commented 1 year ago

@smturzo Thanks for the quick reply! I'm not very familiar with the mgnify file format. Since you gave me a warning, could you explain a little more or share an example of the file content?

Zuricho commented 1 year ago

Actually, I recently wrote a script to create an empty MSA: https://github.com/Zuricho/ParaFold_dev/blob/main/parafold/create_fakemsa.py. To use it, run AlphaFold with the --use_precomputed_msas flag so that it picks up these empty MSA files.

LasseMiddendorf commented 1 year ago

Hello, I tried to run AlphaFold with empty MSAs as described here. However, when I check the msa array in the features.pkl file, it still contains the original alignment. Does anybody have an idea what I might be doing wrong? Any help is very much appreciated!
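
For reference, a minimal sketch of the check described above: it loads the pickled feature dictionary and reports how many rows the msa array has. The helper name `msa_depth` is hypothetical. With a real AlphaFold output the arrays are NumPy arrays, so NumPy must be installed for the pickle to load. A depth much larger than 1 would suggest the original alignment was still used.

```python
import pickle


def msa_depth(features_path):
    """Return the number of rows (aligned sequences) in the 'msa' feature.

    Hypothetical helper; assumes the pickle holds a dict with an 'msa'
    entry shaped (num_alignments, num_residues).
    """
    with open(features_path, "rb") as handle:
        features = pickle.load(handle)
    # len() gives the first dimension for NumPy arrays and nested lists alike.
    return len(features["msa"])
```

Usage would be `print(msa_depth("features.pkl"))`, with the path pointing at the file in your prediction's output directory.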

Thanks, Lasse