ignatovmg / mhc-adventures

MIT License

How dense must the sampling be to reach < 2.5A? #2

Open ignatovmg opened 5 years ago

ignatovmg commented 5 years ago

Use the sampler class from here to study how many samples we need to reach a proper sampling density. Some useful functionality for RMSD calculation is here. Ideally, we want 5% of the conformations in the dataset to be below 2.5 Å heavy-atom RMSD (all atoms minus hydrogens) to ensure proper training. Also study the resulting backbone RMSD; our target is around 1.2 Å backbone RMSD (the GradDock result).
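To make the target above concrete, here is a minimal sketch of the two quantities involved: a heavy-atom RMSD and the fraction of sampled conformations below a cutoff. It is not the repository's RMSD code. The function names are hypothetical, and it assumes the sampled and native peptides share the same atom order and are already in the same receptor frame, so no superposition is done.

```python
import numpy as np

def heavy_atom_mask(elements):
    """Boolean mask that drops hydrogens; `elements` holds one element symbol per atom."""
    return np.array([e.strip().upper() != "H" for e in elements])

def rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate arrays, without superposition (assumption)."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate arrays must have the same shape")
    # Mean of per-atom squared distances, then square root.
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def fraction_below(rmsds, threshold):
    """Fraction of conformations whose RMSD is strictly below `threshold`."""
    return float(np.mean(np.asarray(rmsds, dtype=float) < threshold))
```

With per-conformation RMSDs collected in a list, `fraction_below(rmsds, 2.5) >= 0.05` would express the heavy-atom goal, and the same helper with a 1.2 cutoff would cover the backbone check.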

cubazis commented 5 years ago

I need your feedback on whether I've understood the conformation calculations in this issue correctly. It would be great if you could write it here.

ignatovmg commented 5 years ago

Looks good, but several things:

  1. Can you change "<= 9" to "== 9"? We are interested in 2 subsets: a) all peptides, b) 9-mers only.
  2. Can you change the threshold for bb RMSD to 1.2 instead of 2.5?
  3. The train set should contain 105 complexes, but I saw only ~20. Why?
  4. It's good to know the average values, but from your tables the sampling quality varies strongly from case to case. Is there any way to plot how the number of structures < 2.5 depends on the bulkiness of the amino acid side chains in the peptide? Separately for each length group, since the longer the peptide is, the worse the sampling is.
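One possible way to prepare the data for such a plot is sketched below. As an assumption, bulkiness is approximated by the mean number of side-chain heavy atoms per residue; this proxy is chosen here only for illustration and is not a scale taken from this project. The function names and the `(sequence, count)` record layout are hypothetical.

```python
from collections import defaultdict

# Side-chain heavy-atom counts for the 20 standard amino acids (one-letter codes).
SIDE_CHAIN_HEAVY_ATOMS = {
    "G": 0, "A": 1, "S": 2, "C": 2, "V": 3, "T": 3, "P": 3,
    "I": 4, "L": 4, "D": 4, "N": 4, "M": 4, "E": 5, "Q": 5,
    "K": 5, "H": 6, "F": 7, "R": 7, "Y": 8, "W": 10,
}

def bulkiness(peptide):
    """Mean side-chain heavy-atom count per residue (a proxy, see the assumption above)."""
    return sum(SIDE_CHAIN_HEAVY_ATOMS[aa] for aa in peptide) / len(peptide)

def group_by_length(records):
    """Group (sequence, n_below_threshold) records into per-length (bulkiness, count) points."""
    groups = defaultdict(list)
    for seq, n_good in records:
        groups[len(seq)].append((bulkiness(seq), n_good))
    # Sort each group by bulkiness so it can be drawn directly as a scatter or line plot.
    return {length: sorted(points) for length, points in groups.items()}
```

Each value in the returned dictionary is one length group, so every group can go on its own axes with bulkiness on x and the number of structures < 2.5 on y.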

Also, this analysis would fit better under issue #4. For this one, I was thinking that you could run sampling for the worst/medium peptide cases (more than 4000 samples) and see how many samples we need to start producing good conformations at all. There are a number of cases where we have zero conformers < 2.5; it would be nice to fix that.

You can try the following: a) Produce > 4k samples and see if we start getting good conformers. b) Remove the receptor during brikard sampling and see if it helps. I produced the whole dataset with the receptor included, and maybe this was a bad idea. Brikard filters out conformations that clash with surrounding atoms; therefore, if the peptide was initially not placed very well, brikard may be throwing out good conformations because of that. We need to check whether this is the case by removing the receptor during sampling.
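Both experiments come down to the same measurement: given the RMSDs of the samples in the order they were generated, how many samples does it take to get the first good conformer, and how does the count of good conformers grow? A minimal sketch follows; the function names are hypothetical and the 2.5 default is the heavy-atom cutoff from this issue.

```python
import numpy as np

def first_hit_index(rmsds, threshold=2.5):
    """1-based number of samples drawn until the first RMSD < threshold, or None if none."""
    hits = np.flatnonzero(np.asarray(rmsds, dtype=float) < threshold)
    return int(hits[0]) + 1 if hits.size else None

def cumulative_hits(rmsds, checkpoints, threshold=2.5):
    """Number of conformers below `threshold` among the first n samples, for each n."""
    below = np.asarray(rmsds, dtype=float) < threshold
    # Prepend 0 so that running[n] is the count within the first n samples.
    running = np.concatenate(([0], np.cumsum(below)))
    # Checkpoints past the end of the run report the total count.
    return {n: int(running[min(n, below.size)]) for n in checkpoints}
```

Running this on the same peptide sampled with and without the receptor would show whether removing the receptor makes the first good conformer appear earlier.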

cubazis commented 5 years ago

  3. Because I used your fixed.csv. Your last tar contains only ~20.

ignatovmg commented 5 years ago

I didn't see that; I'll send you the correct set.

cubazis commented 5 years ago

Questions

cubazis commented 5 years ago

I didn't see that; I'll send you the correct set.

image

Is everything okay, man? :)

ignatovmg commented 5 years ago

Ahh, I didn't spot another error. Fixing that.

ignatovmg commented 5 years ago

https://github.com/ignatovmg/mhc-adventures/issues/2#issuecomment-532428437

  1. Those that currently have no, or almost no, near natives at all
  2. length = 9 and length = any

cubazis commented 5 years ago

#2 (comment)

  1. Those that currently have no, or almost no, near natives at all
  2. length = 9 and length = any

  1. Got that
  2. What does "near natives" mean?

ignatovmg commented 5 years ago

#2 (comment)

  1. Those that currently have no, or almost no, near natives at all
  2. length = 9 and length = any

  1. Got that
  2. What does "near natives" mean?

  2. Native structure = crystal structure = the one from the PDB = the answer. Near native = a structure close to the native structure.