rs-station / rs-booster

Useful scripts for analyzing diffraction
https://rs-station.github.io/rs-booster/
MIT License
3 stars 4 forks source link

Align Isomorphous MTZs #32

Open kmdalton opened 1 year ago

kmdalton commented 1 year ago

Many space groups have degenerate choices of origins. Some have only a small number (ie 2) possible origins. Whereas others have a continuum of origins along one or more axis. These are called polar space groups. Given two isomorphous sets of structure factors with different origins, it is possible to align the resulting electron density maps by altering the phases to be consistent. It turns out this is equivalent to the image registration problem in computer vision which can be efficiently solved in fourier space (see: https://scikit-image.org/docs/stable/api/skimage.registration.html#skimage.registration.phase_cross_correlation). Other solutions to this problem rely on protein structures which is undesirable for some applications.

A detailed discussion of this is including some example implementations is in https://github.com/Hekstra-Lab/reciprocalspaceship/issues/31#issuecomment-1225061578. In the issue on rs, we decided this problem was out of scope for rs and should reside in rs-booster. The last two comments are the most relevant.

I think this project requires

My question is whether anyone wants to tackle this? I'm looking at you,

minhuanli commented 1 year ago

Sounds like something I can do. Similar to the translational search in molecular replacement, but with a smaller searching space (discrete or continuum)?

I saw the discussion in the rs issue. The function find_equivalent_origins works for both polar and non-polar case?

kmdalton commented 1 year ago

I think find_equivalent_origins is sorta stale. We want to go forward with an implementation that uses skimage.registration.phase_cross_correlation. In principle this will work for polar and nonpolar. However, in the nonpolar case it is probably faster and more accurate to serially check all the possible origins since there are never that many.

JBGreisman commented 1 year ago

Here's a template I had put together from before we identified that skimage.registration.phase_cross_correlation worked well: https://github.com/Hekstra-Lab/reciprocalspaceship/pull/174

I think the NonPolarTranslator case is pretty good for the brute force search of discrete cases, but the PolarTranslator should be changed to use the skimage rather than a global optimizer. Also, rs.utils.is_polar() can be used to determine which case to use.

JBGreisman commented 1 year ago

We determined that find_equivalent_origins wasn't quite necessary here because it was easier to just evaluate the 64 possible discrete cases for non-polar spacegroups. Technically, the phase_cross_correlation() method would work for non-polar cases too, but we can get better accuracy/precision from the known possible cases.

Reach out if you have any questions -- we did a fair amount of work on this in the past before deciding it was better suited for rsbooster

minhuanli commented 1 year ago

Got it. I can think about the phase_cross_correlation(). And it would be great if you know some examples about that.

One question, by 64 possible discrete case you mean the following two lines in your PR before?

cases = np.array([0.0, 1.0 / 3.0, 0.5, 2.0 / 3.0])
t_cases = np.vstack(np.meshgrid(cases, cases, cases)).reshape((3, -1)).T

Mostly for my own reference: I feel the find_equivalent_origins function might not be useful for this case, but should also exist as a function in rs.utils, like the rs.utils.polar_axes. And seems the implementation in the previous discussion is not correct. For P 31 2 1, the alternative origins should be [0,0,0] and [0,0,1/2] according to this ccp4 table. But that function gives:

array([[0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.33333333],
       [0.        , 0.        , 0.66666667]])
JBGreisman commented 1 year ago

yeah -- I recall finding some examples where the find_equivalent_origins did not match the ccp4 table. It might be useful, but we didn't have a use case so we didn't figure out the right way to call it.

If have a 3D reciprocal grid of complex structure factors, you can callphase_cross_correlation as:

phase_cross_correlation(complex1, complex2, space="fourier")[0]

you can then divide by your grid size to get the desired translation in fractional coordinates.