Rappsilber-Laboratory / AlphaLink2

AlphaLink2: Integrating crosslinking MS data into Uni-Fold-Multimer
Creative Commons Attribution 4.0 International

Notebook AlphaLink2 producing corrupted output #6

Open ggerlach1 opened 12 months ago

ggerlach1 commented 12 months ago

I am excited to see the ColabFold notebook for this method! I tried running it with a cystine-rich peptide that binds to a larger protein with a single cross-link. The structure produced looks reasonable for the protein, but the smaller peptide appears to be chopped up into several small pieces. I say "appears" because the .zip file that is produced is corrupted, so I cannot look at the results outside of the embedded viewer. Additionally, the original AlphaLink has a method to specify distances other than the one it was trained on. Is this feature possible in AlphaLink2?

Any help is greatly appreciated. Very useful method and the notebooks are a great resource.

lhatsk commented 12 months ago

Glad to hear! :-)

I'm curious: we have never tested AlphaLink on peptides. How well does AlphaFold-Multimer work in this case?

Unfortunately, I cannot reproduce the problem with the .zip file on macOS. You could try replacing the zip step with tar. Click "show code" in the last cell and replace the last lines with this:

!tar cvfz {target_id}.tar.gz {" ".join(file_lists)}
files.download(f'{target_id}.tar.gz')

Let me know if that works for you.
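For anyone who would rather stay in Python than shell out to tar, a minimal sketch of the same archiving step with the standard-library tarfile module could look like the following. The helper names make_tar_gz and list_tar_gz are hypothetical and not part of the notebook; re-opening the archive right after writing it is just a quick way to confirm that it is readable before downloading.

```python
import tarfile
from pathlib import Path


def make_tar_gz(archive_path, file_list):
    """Write the files in file_list into a gzip-compressed tar archive."""
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in file_list:
            # arcname keeps only the file name and drops directory prefixes
            tar.add(name, arcname=Path(name).name)
    return archive_path


def list_tar_gz(archive_path):
    """Re-open the archive and return its member names (raises if unreadable)."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

In the notebook this would presumably be called as make_tar_gz(f"{target_id}.tar.gz", file_lists), followed by the same files.download call as in the snippet above.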

Regarding the distances: unfortunately, that is not possible at the moment. It is planned, but I cannot give a time estimate. Do you have a specific distance in mind? We have another network trained on 10A (v2 weights), which we may be able to release in the meantime.

ggerlach1 commented 11 months ago

In my (limited) experience, AlphaFold-Multimer works okay with peptides, and it can be helpful specifically when the goal is to generate a possible structure.

Thanks for the tar idea; the output is no longer corrupted.

The linker we have in this system spans at most 11A between side-chain nitrogens, so I think the 10A network would be much closer. If you can release that, I would appreciate it.
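To check whether a predicted model actually satisfies such a distance limit, one option is to measure the nitrogen-nitrogen distance directly from the output PDB file. The sketch below is an assumption-laden illustration, not part of AlphaLink2: it assumes standard fixed-column PDB text, assumes the cross-linked atoms are named NZ (as for lysine side chains, which the thread does not state), and uses hypothetical function names. The 11.0 limit is the value quoted above.

```python
import math

MAX_LINKER_DISTANCE = 11.0  # in Angstrom, the limit quoted in the comment above


def atom_coords(pdb_text, chain, resnum, atom_name):
    """Return (x, y, z) of one atom from PDB-format text, or None if absent."""
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed-width PDB columns: atom name 13-16, chain 22, residue number 23-26
        if (line[12:16].strip() == atom_name
                and line[21] == chain
                and int(line[22:26]) == resnum):
            return (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return None


def crosslink_distance(pdb_text, site_a, site_b, atom_name="NZ"):
    """Distance between the named atom at two (chain, resnum) sites.

    atom_name="NZ" is an assumption (lysine side-chain nitrogen).
    """
    a = atom_coords(pdb_text, site_a[0], site_a[1], atom_name)
    b = atom_coords(pdb_text, site_b[0], site_b[1], atom_name)
    if a is None or b is None:
        raise ValueError("one of the requested atoms was not found")
    return math.dist(a, b)
```

A call such as crosslink_distance(pdb_text, ("A", 10), ("B", 5)) <= MAX_LINKER_DISTANCE then gives a simple yes/no check per model; the chain IDs and residue numbers here are placeholders.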

lhatsk commented 11 months ago

I added the 10A network as an option to the ColabFold notebook.

Since you were able to look at the peptide now, was it reasonable or somehow broken up?

ggerlach1 commented 11 months ago

Thank you! It was a reasonable output. I believe there was an issue on my end with the fasta file (erroneous newline characters).

I look forward to trying the 10A network.
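Stray newline characters like the ones mentioned above can be cleaned up before uploading a FASTA file. The following is a minimal sketch with a hypothetical helper name, clean_fasta; it assumes that one header line followed by the sequence on a single line is an acceptable input format, which should be checked against what the notebook actually expects.

```python
def clean_fasta(text):
    """Normalize line endings, drop blank lines, and join wrapped sequences."""
    records = []
    header, seq_parts = None, []
    # Treat Windows (\r\n) and old Mac (\r) line endings like plain \n
    for raw in text.replace("\r\n", "\n").replace("\r", "\n").split("\n"):
        line = raw.strip()
        if not line:
            continue  # skip stray empty lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq_parts)))
            header, seq_parts = line, []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            seq_parts.append(line)
    if header is not None:
        records.append((header, "".join(seq_parts)))
    # One header line and one sequence line per record
    return "".join(f"{h}\n{s}\n" for h, s in records)
```

Comparing the cleaned text with the original file is a quick way to see whether the input contained any such characters at all.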

ggerlach1 commented 11 months ago

I ran the 10A network and unfortunately it produced the chopped-up output I had seen previously. This was with the same fasta and csv that had previously produced valid output, so I am confident it is not an issue with the input. Interestingly, this time the zip file was not corrupted, but the peptide chain has many residues too far apart and a collection of clashing atoms.

Do you have any thoughts on what could be causing this seemingly random occurrence?
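One way to make "residues too far apart" concrete is to list sequence-adjacent residues whose Cα atoms are implausibly distant. The sketch below is only an illustration with hypothetical function names: it assumes standard fixed-column PDB text, and the 4.2 cutoff is an assumed threshold chosen a little above the usual spacing of about 3.8 Angstrom between neighbouring Cα atoms.

```python
import math

CA_CA_MAX = 4.2  # assumed cutoff in Angstrom; adjacent CA atoms are normally ~3.8 apart


def ca_atoms(pdb_text, chain):
    """Return [(resnum, (x, y, z)), ...] for the CA atoms of one chain, in file order."""
    out = []
    for line in pdb_text.splitlines():
        # ATOM records only, so calcium ions stored as HETATM "CA" are ignored
        if line.startswith("ATOM") and line[12:16].strip() == "CA" and line[21] == chain:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            out.append((int(line[22:26]), xyz))
    return out


def chain_breaks(pdb_text, chain, cutoff=CA_CA_MAX):
    """Return (resnum_i, resnum_j, distance) for consecutive residues farther apart than cutoff."""
    atoms = ca_atoms(pdb_text, chain)
    breaks = []
    for (r1, p1), (r2, p2) in zip(atoms, atoms[1:]):
        d = math.dist(p1, p2)
        # Only flag residues that are adjacent in numbering
        if r2 == r1 + 1 and d > cutoff:
            breaks.append((r1, r2, d))
    return breaks
```

Running chain_breaks on the peptide chain of a good model and of a chopped-up one would show whether the breaks fall at the same positions in every bad run.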

lhatsk commented 11 months ago

Strange. Sorry, I have no idea. I have never come across this. I would be curious to see what happens if you relax it. Maybe it's worth running this notebook on your output: https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/beta/relax_amber.ipynb It will probably just fail outright, though.

If you are able to share your inputs (maybe confidentially by mail), I would be very curious.