Open dstern opened 1 year ago
Do you have precomputed msa files at the target location for AlphaFold? If there are no files there, AF ignores the --use_precomputed_msas flag
Yes, all the required MSAs are in the correct folder. AF2 finishes and produces the structures. It is simply wasting most of its compute time searching the HH databases for sequences that I know don't exist there. I would love to turn off this search and save some $$.
I've been struggling with the --use_precomputed_msas flag too, despite creating the directory before running AlphaFold:
{$Fasta_name}/msas/{<.a3m file goes here>}
It seems to just bypass it and go on to run its own MSA search?
This is the first time I'm trying to run with pre-computed msas so I'm probably doing it wrong.
You need to provide multiple alignment files in the msas folder, with the correct names: bfd_uniclust_hits.a3m, bfd_uniclust_hits.sto, mgnify_hits.sto, and uniref90_hits.sto.
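A quick way to check the folder before launching a run is sketched below. This is only an illustration: the function name missing_msa_files and its arguments are hypothetical, the file names are the ones listed in this comment, and the folder layout is the {fasta name}/msas/ layout mentioned earlier in the thread. Other AlphaFold versions or presets may expect different names.

```python
from pathlib import Path

# File names as listed in this thread (assumption: your AlphaFold
# version and database preset expect exactly these names).
EXPECTED_MSA_FILES = (
    "bfd_uniclust_hits.a3m",
    "bfd_uniclust_hits.sto",
    "mgnify_hits.sto",
    "uniref90_hits.sto",
)


def missing_msa_files(output_dir, fasta_name, expected=EXPECTED_MSA_FILES):
    """Return the expected MSA file names that are absent or empty.

    Looks in <output_dir>/<fasta_name>/msas/, the layout described
    in this thread. This helper is hypothetical and not part of AlphaFold.
    """
    msa_dir = Path(output_dir) / fasta_name / "msas"
    missing = []
    for name in expected:
        path = msa_dir / name
        # Treat zero-byte files as missing, since they carry no alignment.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

If the returned list is non-empty, the naming or location of the files is the first thing to check.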
I first make the a3m file and then reformat this file into the other three files.
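For anyone wondering what that reformatting step involves, here is a minimal sketch of an A3M to Stockholm conversion. It is not the script attached to this thread. It makes a simplifying assumption: lowercase insertion characters in the A3M are dropped, so every sequence is written in the query's coordinates. The function names read_a3m and a3m_to_stockholm are hypothetical, and I have not verified the output against AlphaFold's own parser.

```python
def read_a3m(text):
    """Parse A3M/FASTA-style text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            name = (line[1:].split() or ["unnamed"])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def a3m_to_stockholm(text):
    """Convert A3M text to a single-block Stockholm alignment.

    Simplification: lowercase letters (insertions relative to the
    query) and '.' characters are removed rather than padded out.
    """
    records = read_a3m(text)
    if not records:
        raise ValueError("no sequences found in A3M input")
    aligned = [
        (name, "".join(c for c in seq if not c.islower() and c != "."))
        for name, seq in records
    ]
    if len({len(seq) for _, seq in aligned}) != 1:
        raise ValueError("sequences differ in length after removing insertions")
    width = max(len(name) for name, _ in aligned)
    lines = ["# STOCKHOLM 1.0", ""]
    for name, seq in aligned:
        lines.append(f"{name.ljust(width)} {seq}")
    lines.append("//")
    return "\n".join(lines) + "\n"
```

Note that Stockholm files expect unique sequence names, so duplicate headers in the A3M would need renaming first.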
I have attached a shell script that takes several inputs and produces a folder with all the required files in the correct places.
The inputs are:
1 - the protein name (exactly as it appears in item 3)
2 - a gff file output from signalp, to allow removal of N-terminal signal peptides
3 - a fasta file that contains the protein sequence you want to model
4 - a fasta file that you want to use for the phmmer search, to generate the msa
The script requires phmmer, seqkit, and mafft, and also uses the following scripts:
I hope this helps you.
Thank you so much, I'll update with how I get on.
I had assumed AlphaFold would take any .a3m input it found.
I use precomputed MSAs (--use_precomputed_msas) for my work, and AF2 currently spends most of its time performing HHSearch and HHblits, even though I know it won't find anything. Is there a flag to turn off these searches? They end up wasting a lot of resources on my large runs.