sokrypton / ColabFold

Making Protein folding accessible to all!
MIT License

Using AF-Multimer with custom MSA #76

Open jessica-andreani opened 2 years ago

jessica-andreani commented 2 years ago

Hello, we've been trying the updated AlphaFold2_mmseqs2 notebook, hoping to be able to use AF-Multimer with a custom MSA. However, setting msa_mode to custom seems to revert the pipeline to monomer modeling (and monomer parameters). Looking at the way things are handled through queries_path and later get_queries, I guess this possibility has not been implemented (yet?), but I would be grateful for advice if we have overlooked something. Thanks for your help and many thanks again for the great contributions. Best, Jessica

martin-steinegger commented 2 years ago

This is currently not supported. We are discussing what kind of MSA format would work for this.

martin-steinegger commented 2 years ago

@jessica-andreani we now support MSAs for complex modeling using an annotated A3M file.

Our A3M starts with a header line marked by a #. The header consists of two lists separated by a tab: the first list contains the sequence length of each chain and the second its cardinality. After the header, the A3M starts. It contains the paired and unpaired information for each chain sequence. Every chain occurs only once in the A3M, even if it is used multiple times in the complex prediction; the number of copies is set through the cardinality information.
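As a rough illustration of the header described above, here is a minimal Python sketch that splits the # line into chain lengths and cardinalities. The function name parse_annotated_header is made up for this example and is not part of ColabFold; the sketch only assumes the layout stated in this comment (a #, two comma-separated lists, one tab between them).

```python
def parse_annotated_header(line):
    """Split a header such as '#12,10<TAB>1,1' into ([12, 10], [1, 1]).

    Hypothetical helper for illustration; not ColabFold's own parser.
    """
    if not line.startswith("#"):
        raise ValueError("header line must start with '#'")
    # The two lists are separated by a single tab character.
    fields = line[1:].rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(
            "expected two tab-separated lists, got %d field(s)" % len(fields)
        )
    lengths = [int(x) for x in fields[0].split(",")]
    cardinalities = [int(x) for x in fields[1].split(",")]
    if len(lengths) != len(cardinalities):
        raise ValueError("need exactly one cardinality per chain length")
    return lengths, cardinalities


print(parse_annotated_header("#12,10\t1,1"))  # heterodimer header
print(parse_annotated_header("#12\t2"))       # homodimer header
```

A header whose lists are separated by spaces instead of a tab raises a ValueError in this sketch, which makes that kind of formatting mistake easy to spot before submitting the file.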

Example 1: heterodimer using paired and unpaired information

#12,10	1,1
>101	102
MGSSHHHHHHSQMTVVPPEGAI
>UniRef100_XXX1	UniRef100_XXX2
MGSAHHhhHHHHSQMTAAaPPEGAI
>101
MGSSHHHHHHSQ----------
>UniRef100_XXX1
MGSAHHhhHHHHSQ----------
>UniRef100_XXX1_2
MGAAHHhhHHHHSQ----------
>101
------------MTVVPPEGAI
>UniRef100_XXX2
------------MTAAaPPEGAI
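One way to sanity-check a file like Example 1 before uploading it is to count alignment columns per row. The sketch below assumes that every row spans all chains (as in the example), that each sequence sits on a single line, and that lowercase letters are A3M insertions that do not occupy a column. The helper names match_columns and check_rows are hypothetical.

```python
def match_columns(seq):
    # In A3M, lowercase letters are insertions; uppercase letters and
    # gaps ('-') each occupy one alignment column.
    return sum(1 for c in seq if not c.islower())


def check_rows(a3m_text):
    """Return (name, columns_found, columns_expected) for each bad row."""
    lines = [l for l in a3m_text.splitlines() if l.strip()]
    # First list in the header: one length per chain.
    lengths = [int(x) for x in lines[0][1:].split("\t")[0].split(",")]
    total = sum(lengths)
    problems = []
    name = None
    for line in lines[1:]:
        if line.startswith(">"):
            name = line[1:]
        else:
            n = match_columns(line)
            if n != total:
                problems.append((name, n, total))
    return problems


example = "\n".join([
    "#12,10\t1,1",
    ">101\t102",
    "MGSSHHHHHHSQMTVVPPEGAI",
    ">UniRef100_XXX1",
    "MGSAHHhhHHHHSQ---------",  # deliberately one gap short
])
print(check_rows(example))
```

An empty list means every row has as many columns as the summed chain lengths in the header.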

Example 2: homodimer

#12	2
>101
MGSSHHHHHHSQ
>UniRef100_XXX1
MGSAHHhhHHHHSQ
>UniRef100_XXX1_2
MGAAHHhhHHHHSQ

Zkkkkkui commented 1 year ago

I always get "Error: list index out of range" when I try a custom MSA, for both monomer and complex prediction. The MSA was generated by the HHblits Toolkit server as instructed in the notebook. Could you explain why this happens and how I should modify the input MSA file?

AllisterCrow commented 1 year ago

I am struggling to get ColabFold to run with a custom MSA, including the previous examples posted here.

I have tried giving ColabFold the output a3m files from previous runs, but these also produce errors.

Do you have an example custom MSA in a3m format that is verified to work with recent versions of ColabFold, which I could use to check my own custom MSA?

All help much appreciated - and thank you for making this wonderful resource available to the community.

brunocuevas commented 8 months ago

Hi there! I was having a similar problem. I solved it when I used tabs to separate the two lists.
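For anyone hitting the same thing after copy-pasting the examples (tabs often turn into spaces), here is a small sketch that rewrites only the # header line so its two lists are tab-separated. The function name fix_header_separator is hypothetical, and the sketch deliberately leaves the > lines alone, because sequence names may legitimately contain spaces.

```python
def fix_header_separator(a3m_text):
    """Replace whitespace between the two header lists with one tab.

    Hypothetical helper; only touches the first line if it starts with '#'.
    """
    lines = a3m_text.split("\n")
    if lines and lines[0].startswith("#"):
        # str.split() with no argument splits on any run of whitespace.
        fields = lines[0][1:].split()
        if len(fields) == 2:
            lines[0] = "#" + "\t".join(fields)
    return "\n".join(lines)


broken = "#12,10  1,1\n>101\t102\nMGSSHHHHHHSQMTVVPPEGAI\n"
print(repr(fix_header_separator(broken).split("\n")[0]))
```

If the header does not contain exactly two lists, the text is returned unchanged, so the sketch never guesses at a malformed header.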

andrenalina1 commented 7 months ago

Hello, I'm using ColabFold, but I'm having a problem when I try to model a complex using an annotated A3M file. The complex is between an antibody and an antigen. This nanoantibody was discovered in my lab and I only have its sequence (there is no crystallography data available for it and it is not present in any database). The antigen, on the other hand, has a sequence and a crystal structure in a database. I am trying to create an MSA using antibodies and antigens that we know interact on the same epitope. When we use only the sequences (without template or MSA), the antibody and the antigen interact at a site that is not correct. We know the epitope of our antibody from experimental data (a specific immunoassay). I am attaching the file because there is a problem in the format. It is the first MSA I have made for ColabFold, with this very particular format. I have tried many times, but apparently I always do something wrong. I followed the instructions, but it doesn't work.

formatseq_1.txt

jdmontenegro commented 4 months ago

Hello everyone, I am curious whether it is possible to produce a heterodimer a3m file starting from individual monomeric a3m files?

miarosenfeld commented 3 months ago

Hi there! I made a Python script to do this. Feel free to use it: https://github.com/miarosenfeld/utils/blob/main/combine_monomer_a3m.py
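For readers who want to see the general idea without opening the link, below is an independent minimal sketch (not the linked script) of how monomer a3m files could be stitched into one annotated A3M with unpaired rows only. It assumes the first entry of each monomer file is the gap-free query, that every chain has cardinality 1, and that the query labels 101, 102, ... follow the paired header in Example 1 above. The function names read_a3m and combine_unpaired are hypothetical, and no pairing information is produced.

```python
def read_a3m(text):
    """Return a list of (name, sequence) pairs, skipping '#' lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            entries.append([line[1:], ""])
        elif entries:
            entries[-1][1] += line  # join sequences wrapped over lines
    return [(name, seq) for name, seq in entries]


def combine_unpaired(monomer_a3ms):
    msas = [read_a3m(t) for t in monomer_a3ms]
    queries = [msa[0][1] for msa in msas]  # assumed gap-free queries
    lengths = [len(q) for q in queries]
    ids = [str(101 + i) for i in range(len(msas))]  # assumed labels
    out = ["#" + ",".join(map(str, lengths)) + "\t" + ",".join("1" for _ in msas)]
    out.append(">" + "\t".join(ids))
    out.append("".join(queries))
    for i, msa in enumerate(msas):
        # Pad with gaps so every row spans all chains.
        left = "-" * sum(lengths[:i])
        right = "-" * sum(lengths[i + 1:])
        for j, (name, seq) in enumerate(msa):
            label = ids[i] if j == 0 else (name.split()[0] if name.strip() else name)
            out.append(">" + label)
            out.append(left + seq + right)
    return "\n".join(out) + "\n"


chain_a = ">q1\nMGSSHHHHHHSQ\n>UniRef100_XXX1\nMGSAHHhhHHHHSQ\n"
chain_b = ">q2\nMTVVPPEGAI\n>UniRef100_XXX2\nMTAAaPPEGAI\n"
print(combine_unpaired([chain_a, chain_b]))
```

Because the hits are only padded and never matched across chains, the result carries no paired rows; whether that is sufficient for a given complex is a separate question.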