Open stephaniewankowicz opened 1 year ago
I tested removing low-density conformers and putting a QP test in the angle sampling. Both removed almost all of the 'too many conformers, reverting' messages. However, the QP in the angle sampling kind of blew up the R-free, while removing a conformer when even one atom is below a certain threshold tended to increase the number of residues for which we could not find a solution. I tested removal when 1, 2, or 3 atoms are below the cutoff value. All of these removed the angle issue but increased the number of residues for which we could not find a good conformer. I am going to try with 4 or 5 atoms, as this should still eliminate many aromatic conformations.
OK, perhaps this comment is a little broader than this specific issue --- lmk if I should create a new issue to track it?
Problem: overfitting / trying to fit against an "oversampled" model
I've seen a bunch of "too many conformers" cases in which there are over 2000 conformers (and sometimes over 10000!). In these circumstances, we know in advance that we will be trying to find a best fit for 2000 conformers to ~1500 voxels (or so).
To my ears: that's at best an overfit QP solution (more parameters than datapoints), at worst an unsolvable QP. As you highlighted in #378, and as you're trying to address here in this issue (#346), the change to the angle sampling will make this yet more common.
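To make the "more parameters than datapoints" point concrete, here is a minimal sketch using a toy density matrix. It is not qFit code: `is_underdetermined` is a hypothetical helper, and the random matrix `A` only stands in for per-conformer densities sampled at each voxel. The actual QP also has constraints on the weights, so this only illustrates why the unconstrained fit cannot pin down a unique set of weights.

```python
import numpy as np


def is_underdetermined(n_conformers: int, n_voxels: int) -> bool:
    """Hypothetical guard: True when there are more weights than data points."""
    return n_conformers > n_voxels


# Toy stand-in for the design matrix: one column per conformer,
# one row per voxel. Sizes are scaled down from the ~2000 vs ~1500 case.
rng = np.random.default_rng(0)
n_voxels, n_conformers = 15, 20
A = rng.random((n_voxels, n_conformers))

# The rank can never exceed the number of voxels, so with more conformers
# than voxels the columns are linearly dependent and many different weight
# vectors reproduce the same density.
rank = np.linalg.matrix_rank(A)
print(is_underdetermined(n_conformers, n_voxels), rank <= n_voxels)
```

A check like this could run before the solver is called, so that sampling is pruned or coarsened up front instead of failing afterwards; where exactly it would go in the sampling code is left open here.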
Suggestions
Too many conformers generated (29720). Reverting to a previous iteration of degrees of freedom: item 0. n_coords: [29720]
That's ... not really reverting, tbh. Is this another bug?
This feels like the sampling code might need a pretty deep rewrite, but I think it would be good to have this in place if you're gonna get backbone sampling working? (I'm excited for this!)
Currently, in qFit, removing conformers below a certain density level is defined as: if any voxel in the conformer has a density <0.3 e⁻ Å⁻³, the conformer will be removed.
This is turned off by default.
We should include something like this, but it should check whether roughly >=5 atoms lack density support.
This should be tested with different atom-count cutoff values.
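A rough sketch of what an atom-count version of this filter could look like, so the different cutoffs can be compared side by side. Everything here is an assumption for illustration: `keep_mask`, the `min_unsupported_atoms` parameter, and the idea of one density value per atom are hypothetical and not the existing qFit option.

```python
import numpy as np


def keep_mask(atom_density, cutoff=0.3, min_unsupported_atoms=5):
    """Return a boolean mask of conformers to keep.

    atom_density: array of shape (n_conformers, n_atoms) holding the density
    value at each atom position (assumed input, not a qFit data structure).
    A conformer is dropped once at least `min_unsupported_atoms` of its atoms
    fall below `cutoff`.
    """
    atom_density = np.asarray(atom_density, dtype=float)
    n_unsupported = (atom_density < cutoff).sum(axis=1)
    return n_unsupported < min_unsupported_atoms


# Three toy conformers with six atoms each.
density = np.array([
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4],  # no atom below the cutoff
    [0.9, 0.2, 0.7, 0.6, 0.5, 0.4],  # one atom below the cutoff
    [0.1, 0.2, 0.1, 0.2, 0.1, 0.4],  # five atoms below the cutoff
])

# Compare how many conformers survive for each atom-count threshold.
for k in (1, 2, 3, 4, 5):
    kept = keep_mask(density, cutoff=0.3, min_unsupported_atoms=k)
    print(k, int(kept.sum()))
```

With `min_unsupported_atoms=1` this reduces to the strict "any atom below the cutoff" rule; raising it keeps conformers with a few weak atoms while still dropping ones that are mostly unsupported.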