Rappsilber-Laboratory / AlphaLink2

AlphaLink2: Integrating crosslinking MS data into Uni-Fold-Multimer
Creative Commons Attribution 4.0 International

Cross link input #3

Open roivant-matts opened 1 year ago

roivant-matts commented 1 year ago

Hello, thanks for this excellent project. For technical reproducibility, are you able to share the RpoA-RpoC cross-links as a CSV or pkl dictionary? I wasn't able to find them in the paper or its supplements.

lhatsk commented 1 year ago

The crosslinks are from this paper (Data Availability): https://www.embopress.org/doi/full/10.15252/msb.202311544

RPOA-RPOC.pkl.gz

roivant-matts commented 1 year ago

Thanks. I gather the CSV format is 1-based, but the dictionary is 0-based? I guess that is expected, but I needed to add FDR values to the file above to get a prediction.

lhatsk commented 1 year ago

Yes, the CSV format is 1-based and the dictionary is 0-based. Sorry about the missing FDR; my internal set-up is a little different and uses a fixed FDR.
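A minimal sketch of the index conversion discussed above: it takes 1-based CSV-style rows, shifts the residue numbers to 0-based, fills in a fixed FDR where none is given, and writes a gzipped pickle. The function names, the row layout `(residue1, chain1, residue2, chain2[, FDR])`, and the nested layout `links[chain1][chain2] = [(i, j, fdr), ...]` are assumptions made for illustration, so check the repository README for the exact format AlphaLink2 expects.

```python
import gzip
import pickle


def crosslinks_to_dict(rows, default_fdr=0.05):
    """Convert 1-based rows into a 0-based nested dictionary.

    Each row is assumed (hypothetically) to be
    (residue1, chain1, residue2, chain2) with an optional fifth FDR field.
    """
    links = {}
    for row in rows:
        res1, chain1, res2, chain2 = row[:4]
        res1, res2 = int(res1), int(res2)
        if res1 < 1 or res2 < 1:
            raise ValueError(f"expected 1-based residue numbers, got {row!r}")
        # Use the fixed default when the row carries no FDR value.
        has_fdr = len(row) > 4 and row[4] not in ("", None)
        fdr = float(row[4]) if has_fdr else default_fdr
        # Shift from 1-based (CSV) to 0-based (dictionary).
        links.setdefault(chain1, {}).setdefault(chain2, []).append(
            (res1 - 1, res2 - 1, fdr)
        )
    return links


def save_crosslinks(links, path):
    """Write the dictionary as a gzipped pickle (e.g. 'links.pkl.gz')."""
    with gzip.open(path, "wb") as fh:
        pickle.dump(links, fh)


def load_crosslinks(path):
    """Read a gzipped pickle back into a dictionary."""
    with gzip.open(path, "rb") as fh:
        return pickle.load(fh)
```

Plain dictionaries are used rather than `collections.defaultdict` with a lambda factory, because the latter cannot be pickled.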

roivant-matts commented 1 year ago

Thanks. I am rerunning to be sure I didn't have a mix-up, but using the v2 params with and without cross-links (e.g. an empty {} pkl.gz), I get the same structure. Both match the reference PDB closely, e.g. when superimposing AlphaLink2 chain B (RpoC) on chain D of the reference. Is my approach of using an empty dictionary as a baseline appropriate in your view? Edit: I am using an FDR of 0.20 on all the links you shared.
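For reference, a minimal sketch of how such an empty baseline file could be produced, assuming the crosslink input is simply a gzipped pickle of a Python dictionary. The helper names and the file name are hypothetical.

```python
import gzip
import pickle


def write_empty_crosslinks(path):
    """Write an empty dictionary as a gzipped pickle (no-crosslink baseline)."""
    with gzip.open(path, "wb") as fh:
        pickle.dump({}, fh)


def read_crosslinks(path):
    """Load a gzipped pickle, e.g. to confirm the baseline file is empty."""
    with gzip.open(path, "rb") as fh:
        return pickle.load(fh)
```

Calling `write_empty_crosslinks("empty.pkl.gz")` and passing that file in place of the real crosslinks keeps the rest of the run identical, so any difference in the prediction can be attributed to the links.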

lhatsk commented 1 year ago

Yes, that works as a baseline. We used an FDR of 0.05 but it doesn't matter here.

We noticed the same thing last week. Increasing the crop size during fine-tuning already seems to improve the RpoA-RpoC prediction in some runs. In our runs, 5/10 failed without crosslinks, whereas 10/10 succeeded with crosslinks. So, similar to the other experiments, the crosslinks allow us to focus sampling on the interesting regions.

On the CASP data, it didn't seem to have a big effect (see extended data figure 3 in the v2 paper supplement).

roivant-matts commented 9 months ago

For the Figure 3 data, do you recall whether the v2.2.4 or v2.3.0 weights were used for the AlphaFold predictions? I noticed in their release notes for v2.3.0 that they also increased the crop size to 640 AA. Apologies for the lag in coming back to this thread; some other testing brought it back to my mind.

lhatsk commented 8 months ago

Sorry for the late response!

For Figure 3 (the Cullin4 data), we switched to v2.3.0 for AlphaFold and AlphaLink because the other networks were not able to produce meaningful predictions. Essentially, the structures were just floating in space (disconnected). Version v2.3.0 performs much better for larger complexes.