bwicky / oligomer_hallucination

make_msa_features() got an unexpected keyword argument 'deletion_matrices' #1

Open tubiana opened 2 years ago

tubiana commented 2 years ago

Thank you so much for the release of your code =D

I'm trying to model a C12 oligomer of one of my proteins, and I followed your procedure to install oligomer_hallucination.

I had some issues with the conda environment installation, which I managed to fix easily, but now I get an error from the alphafold pipeline tools: TypeError: make_msa_features() got an unexpected keyword argument 'deletion_matrices' (as input I used your first example, ./oligomer_hallucination.py --oligo AAA+ --L 100 --out example).

The version of AlphaFold that I have is a recent one. Do you use the latest AlphaFold version as well? Also, could it be linked to the MSA search (would it be possible to use a custom MSA)?

Thank you :) Thibault.

Full error message :

➜ ./oligomer_hallucination.py --oligo AAA+ --L 100 --out example
gpu
# Namespace(L='100', T_init=0.01, amber_relax=0, commit='', dssp_fractions_specified=None, exclude_AA='C', half_life=1000, loss=[['dual', []]], loss_weights=[1.0], model=4, msa_clusters=1, mutation_method='frequency_adjusted', mutation_rate='3-1', oligo='AAA+', oligo_weights=[1.0], out='example', output_pae=False, position_weights=None, proto_Ls=[100], proto_sequences=None, recycles=1, select_position_params=None, select_positions='random', seq=None, single_chain=False, steps=5000, template=None, template_alignment=None, timestamp=False, tolerance=None, unique_protomers=['A'])
> Git commit: 
> The following oligomers will be designed:
 >> AAA (positive design), contributing 1.0 to the global loss
> Simulated annealing will be performed over 5000 steps with a starting temperature of 0.01 and a half-life for the temperature decay of 1000 steps.
> The mutation rate at each step will go from 3 to 1 over 5000 steps (stepped linear decay).
> The choice of position to mutate at each step will be based on random, with parameter(s): None.
> At each step, selected positions will be mutated based on frequency_adjusted.
> Predictions will be performed with AlphaFold2 model_4_ptm, with recyling set to 1, and 1 MSA cluster(s).
> The loss function used during optimisation was set to: [['dual', []]], with respective weights: [1.0].
> Allowed amino acids: 19 [A R N D Q E G H I L K M F P S T W Y V]
> Excluded amino acids: 1 [C]
 >> Protomer A init sequence: DIWDRGEGIRPKMLMAGGAMKIDFLPGSIDKNLGIDRFEVPYSAKGVTRDRQGRTELASYPGLFKLKNNTAFMATPREGLADDTERLVLENPMGDHLNNV
 >> Protomer A position-specific weights: [0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
 0.01 0.01]
Setting up model:  model_4_ptm
----------------------------------------------------------------------------------------------------
Starting...
Traceback (most recent call last):
  File "./oligomer_hallucination.py", line 277, in <module>
    af2_prediction = predict_structure(oligo,
  File "/home/thibault/softwares/oligomer_hallucination/modules/af2_net.py", line 92, in predict_structure
    **pipeline.make_msa_features(msas=[[query_sequence]],
TypeError: make_msa_features() got an unexpected keyword argument 'deletion_matrices'

bwicky commented 1 year ago

Hi Thibault,

Are you trying to predict the structure of an oligomer for which you have the sequence, or design/hallucinate a C12 from scratch? If you're only trying to predict the structure, you can simply use the AF2.py script in scoring/.

That being said, your error seems to be related to an alphafold version issue. The version we used had make_msa_features taking the deletion_matrices keyword, but that seems to have been removed in the latest version of alphafold (I just checked on GitHub).

Try removing deletion_matrices=[[[0]*len(query_sequence)]] in modules/af2_net.py (at line 93) and let us know if that resolves your issue.

Thanks, Basile


Basile I. M. Wicky, PhD Postdoctoral Fellow, Baker Lab https://www.bakerlab.org/ Institute for Protein Design University of Washington, Seattle

bwicky commented 1 year ago

For full compatibility, maybe use the same AF2 commit as we used for this work: 1109480e6f38d71b3b265a4a25039e51e2343368
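
Short of pinning that commit, a call site could check which keywords the installed function accepts before passing them. The following is only a minimal sketch of that idea: `call_with_supported_kwargs` is a hypothetical helper, and `make_msa_features_old` / `make_msa_features_new` are stand-ins written for illustration, not the real AlphaFold functions.

```python
import inspect


def call_with_supported_kwargs(func, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all means every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    return func(**supported)


# Hypothetical stand-ins for the two signatures discussed in this thread;
# they are NOT the real AlphaFold functions.
def make_msa_features_old(msas, deletion_matrices):
    return {"num_msas": len(msas), "has_deletions": True}


def make_msa_features_new(msas):
    return {"num_msas": len(msas), "has_deletions": False}


query_sequence = "DIWDRGEG"
kwargs = dict(
    msas=[[query_sequence]],
    deletion_matrices=[[[0] * len(query_sequence)]],
)
old_result = call_with_supported_kwargs(make_msa_features_old, **kwargs)
new_result = call_with_supported_kwargs(make_msa_features_new, **kwargs)
```

Dropping a keyword this way only avoids the TypeError. It does not guarantee that the remaining arguments still have the shape the newer function expects, so using the same commit is likely the safer route.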

tubiana commented 1 year ago

Hi :-) Thank you for your answer =D I have a protein sequence and I'm trying to predict its oligomeric form (C12)! I will install the correct AF2 version and try again :-)

Thank you for your help =D Best, Thibault

bwicky commented 1 year ago

Hi Thibault,

Then scoring/AF2.py is probably more what you want. Keep in mind, though, that this script does not take MSAs, and if you have a natural protein sequence, predicting from a single sequence alone may not work. You could look at ColabFold if that's the case.

Best, Basile

ryanbahi commented 1 year ago

Thibault,

Were you ever able to get the code working?

I debugged a bunch and eventually got to the point of your error (unexpected keyword argument 'deletion_matrices'). I then deleted the line per Basile's suggestion and that appears to have resolved that error.

Now I'm getting a new error: AttributeError: 'list' object has no attribute 'sequences'.

Do either you or Basile have any suggestions? Complete error message is included below.

Thanks!

Ryan

MOST RELEVANT OUTPUT:

Starting...
Traceback (most recent call last):
  File "oligomer_hallucination/oligomer_hallucination.py", line 280, in <module>
    random_seed=np.random.randint(42)) # run AlphaFold2 prediction
  File "/content/tape/oligomer_hallucination/modules/af2_net.py", line 92, in predict_structure
    **pipeline.make_msa_features(msas=[[query_sequence]],
  File "/opt/conda/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 65, in make_msa_features
    for sequence_index, sequence in enumerate(msa.sequences):
AttributeError: 'list' object has no attribute 'sequences'

FULL OUTPUT:

WARNING:jax._src.lib.xla_bridge:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
cpu
Namespace(L=None, T_init=0.01, amber_relax=0, commit='', dssp_fractions_specified=None, exclude_AA='C', half_life=1000, loss=[['dual', []]], loss_weights=[1.0], model=4, msa_clusters=1, mutation_method='frequency_adjusted', mutation_rate='3-1', oligo='AA+', oligo_weights=[1.0], out='example', output_pae=False, position_weights=None, proto_Ls=[94], proto_sequences=['GDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARTKKEAEKFAAILIKVFAELGYNDINVTFDGDTVTVEGQLE'], recycles=1, select_position_params=None, select_positions='random', seq='GDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARTKKEAEKFAAILIKVFAELGYNDINVTFDGDTVTVEGQLE', single_chain=False, steps=5000, template=None, template_alignment=None, timestamp=False, tolerance=None, unique_protomers=['A'])
Git commit:
The following oligomers will be designed:
AA (positive design), contributing 1.0 to the global loss
Simulated annealing will be performed over 5000 steps with a starting temperature of 0.01 and a half-life for the temperature decay of 1000 steps.
The mutation rate at each step will go from 3 to 1 over 5000 steps (stepped linear decay).
The choice of position to mutate at each step will be based on random, with parameter(s): None.
At each step, selected positions will be mutated based on frequency_adjusted.
Predictions will be performed with AlphaFold2 model_4_ptm, with recyling set to 1, and 1 MSA cluster(s).
The loss function used during optimisation was set to: [['dual', []]], with respective weights: [1.0].
Allowed amino acids: 19 [A R N D Q E G H I L K M F P S T W Y V]
Excluded amino acids: 1 [C]
Protomer A init sequence: GDIQVQVNIDDNGKNFDYTYTVTTESELQKVLNELMDYIKKQGAKRVRISITARTKKEAEKFAAILIKVFAELGYNDINVTFDGDTVTVEGQLE
Protomer A position-specific weights: [0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383 0.0106383
 0.0106383 0.0106383 0.0106383 0.0106383]
Setting up model: model_4

Starting...
Traceback (most recent call last):
  File "oligomer_hallucination/oligomer_hallucination.py", line 280, in <module>
    random_seed=np.random.randint(42)) # run AlphaFold2 prediction
  File "/content/tape/oligomer_hallucination/modules/af2_net.py", line 92, in predict_structure
    **pipeline.make_msa_features(msas=[[query_sequence]],
  File "/opt/conda/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 65, in make_msa_features
    for sequence_index, sequence in enumerate(msa.sequences):
AttributeError: 'list' object has no attribute 'sequences'
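
The traceback shows make_msa_features iterating over msa.sequences, which suggests that in this AlphaFold version each entry of msas is expected to be an object with a sequences attribute rather than a plain list of strings. The sketch below illustrates that mismatch with stand-ins only: the `Msa` dataclass, its field names (`sequences`, `deletion_matrix`, `descriptions`), the toy `make_msa_features`, and the `single_sequence_msa` helper are all assumptions made for illustration and should be checked against the installed alphafold package before adapting modules/af2_net.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Msa:
    """Stand-in for the MSA container the newer pipeline appears to expect."""
    sequences: Sequence[str]
    deletion_matrix: Sequence[Sequence[int]]
    descriptions: Sequence[str]


def make_msa_features(msas: Sequence[Msa]) -> Dict[str, List]:
    """Toy feature builder that, like the traceback, reads msa.sequences."""
    seen = set()
    deletions = []
    for msa in msas:
        for i, sequence in enumerate(msa.sequences):
            if sequence in seen:
                continue  # skip duplicate sequences
            seen.add(sequence)
            deletions.append(list(msa.deletion_matrix[i]))
    return {"num_alignments": len(deletions), "deletion_matrix_int": deletions}


def single_sequence_msa(query_sequence: str) -> Msa:
    """Hypothetical helper: wrap one sequence as a single-row MSA, no deletions."""
    return Msa(
        sequences=[query_sequence],
        deletion_matrix=[[0] * len(query_sequence)],
        descriptions=["query"],
    )


query_sequence = "GDIQVQVNID"

# Passing a bare list of lists reproduces the kind of error in the traceback.
try:
    make_msa_features([[query_sequence]])
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Wrapping the query in an Msa-like object avoids it.
features = make_msa_features([single_sequence_msa(query_sequence)])
```

If the installed version does work this way, the call in modules/af2_net.py would need an equivalent wrapper around query_sequence instead of msas=[[query_sequence]]. Using the AF2 commit mentioned earlier in the thread avoids the change altogether.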

ryanbahi commented 1 year ago

Basile,

Can I ask if you are still using this code in your own work or are you instead using other tools such as diffusion models, etc.?

I have been trying to figure out how to get this code working in a colab to try to create a dimer of an existing functional domain, but am having difficulty getting there. I'm wondering if the functionality of this software is incorporated into RFdiffusion or other software (or not).

Thanks much,

Ryan

bwicky commented 1 year ago

Hi Ryan,

In answer to your questions:

Best, Basile
