dials / dials

Diffraction Integration for Advanced Light Sources
https://dials.github.io

Adjust tolerances for sys abs check in dials.symmetry #2695

Closed: dagewa closed this issue 4 weeks ago

dagewa commented 1 month ago

Users with electron diffraction data sometimes find that dials.symmetry fails to identify screw axes, apparently because the systematic absences are not exactly absent. Here is an example of a small-molecule case where the correct space group is $P2_1 2_1 2_1$, but dials.symmetry keeps it in $P222$.

The table below shows significant intensity in the "absences", but it also shows that their intensity is considerably lower than that of the present reflections.

+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis   |   Score |   No. present |   No. absent |   <I> present |   <I> absent |   <I/sig> present |   <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 21a          |       0 |             3 |            3 |        33.797 |        8.859 |            35.244 |           10.699 |
| 21b          |       0 |             3 |            4 |       103.125 |       12.033 |           125.073 |           18.319 |
| 21c          |       0 |             0 |            0 |         0     |        0     |             0     |            0     |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
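
To put a number on that gap, here is a quick sketch that computes the absent-to-present ratio of mean intensity for each axis, using only the values in the table above. The dictionary layout and variable names are mine and are not part of dials.symmetry.

```python
# <I> present and <I> absent copied from the table above.
# 21c is left out because it has no reflections in either class.
mean_intensity = {
    "21a": {"present": 33.797, "absent": 8.859},
    "21b": {"present": 103.125, "absent": 12.033},
}

# Ratio of mean "absent" intensity to mean "present" intensity per axis.
ratios = {
    axis: values["absent"] / values["present"]
    for axis, values in mean_intensity.items()
}

for axis, ratio in ratios.items():
    print(f"{axis}: <I> absent / <I> present = {ratio:.3f}")
```

The "absences" come out at roughly 26% (21a) and 12% (21b) of the mean present intensity: clearly weaker, but far from zero.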

It seems it should be possible to separate the two classes, but I have not found a way to make dials.symmetry more tolerant of intensity in the absences. There is the significance_level = *0.95 0.975 0.99 choice, but none of its values finds the correct space group in this case. I would like a way to tune the tolerance over a wider range.

jbeilstenedmands commented 1 month ago

Hi @dagewa, did you try the option systematic_absences.method=fourier? This was implemented by @kmdalton to try to help with these kinds of cases. Perhaps we should also change significance_level so that it can be set to any floating-point value between 0 and 1.

dagewa commented 1 month ago

Thanks, yes, I did try systematic_absences.method=fourier, but I think this case is tricky because the intensity in the "absences" is quite high. I like the idea of having significance_level be a float in [0,1]. That would also fix another oddity of the PHIL choice: significance_level=.95 is not accepted; it has to be written as 0.95.
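
As a toy illustration of why a choice rejects .95 while a float would not, here is a minimal sketch. It assumes the choice is matched as literal text, which is consistent with the behaviour described above but is not taken from the PHIL source. Both parse_choice and parse_unit_float are hypothetical helpers, not libtbx.phil or DIALS functions.

```python
# Hypothetical stand-ins for the two parameter styles; not the libtbx.phil API.
CHOICES = ("0.95", "0.975", "0.99")


def parse_choice(text):
    """Accept only the literal strings listed in CHOICES (assumed behaviour)."""
    if text not in CHOICES:
        raise ValueError(f"{text!r} is not one of {CHOICES}")
    return float(text)


def parse_unit_float(text):
    """Accept any number in the closed interval [0, 1]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{value} is outside [0, 1]")
    return value


print(parse_unit_float(".95"))   # numeric parsing treats .95 and 0.95 alike
print(parse_unit_float("0.65"))  # values outside the old choice list are fine
try:
    parse_choice(".95")
except ValueError as err:
    print("choice rejected:", err)
```

With a bounded float, any spelling of the number works and the allowed range is open to lower thresholds, at the cost of no longer steering users toward conventional significance levels.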

dagewa commented 1 month ago

I investigated this by changing significance_level to take a float. On the data in question, the following command then found the two screw axes that were observed:

dials.symmetry integrated.expt integrated.refl systematic_absences.method=fourier significance_level=.65

So I did have to go quite low with significance_level, but it could be done. A value of 0.78 was low enough to find just the axis along $b$. I guess I need to think about how low significance_level can go and still be significant... As a probability measure, 0.5 would be evens, right? So a screw axis at 0.5 significance is just as likely to be present as absent.

kmdalton commented 1 month ago

@dagewa, the twofold axes seem very obvious in the Fourier power spectra. Do the method=fourier defaults predict the correct space group? If not, we should probably look into those settings too.

kmdalton commented 1 month ago

@dagewa, would you mind sharing the .expt and .refl file for this data set? I want to see if oversampling the Fourier transform helps to clean up the spectra.
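
For readers following along, here is a minimal sketch of what oversampling means in this context. It is my own toy version and not the code in dials.symmetry: the intensities are an idealised alternating series built from the 21a mean values in the table earlier in the thread, the Miller indices 1 to 6 are assumed, and power_spectrum is a hypothetical helper. A twofold screw axis makes every other reflection weak, which shows up as power at 0.5 cycles per index. Evaluating the transform at more frequencies than there are data points (oversampling) samples that peak and its surroundings on a finer grid.

```python
import cmath
import math


def power_spectrum(indices, intensities, n_freq):
    """Return (frequency, power) pairs for f = 0 .. 0.5 cycles per Miller index.

    n_freq sets the frequency grid spacing (1 / n_freq); choosing it larger
    than the number of reflections oversamples the spectrum.
    """
    mean = sum(intensities) / len(intensities)
    centred = [value - mean for value in intensities]  # remove the zero-frequency term
    spectrum = []
    for j in range(n_freq // 2 + 1):
        f = j / n_freq
        amplitude = sum(
            value * cmath.exp(-2j * math.pi * f * h)
            for h, value in zip(indices, centred)
        )
        spectrum.append((f, abs(amplitude) ** 2))
    return spectrum


# Toy data (assumed): odd indices weak, even indices strong.
indices = [1, 2, 3, 4, 5, 6]
intensities = [8.859, 33.797] * 3

coarse = power_spectrum(indices, intensities, n_freq=6)   # one point per reflection
fine = power_spectrum(indices, intensities, n_freq=64)    # oversampled grid
peak_frequency = max(fine, key=lambda pair: pair[1])[0]
print(len(coarse), len(fine), peak_frequency)
```

With real data the indices are sparse and noisy, so the finer grid can make it easier to judge whether the power at 0.5 stands out from its neighbours.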

dagewa commented 1 month ago

Hi @kmdalton, thanks, here are the integrated files: integrated.zip

kmdalton commented 1 month ago

@dagewa, I think oversampling does help screw axis detection. I made a PR to implement it (#2701). Could you have a look when you get a chance?

Here's how the change looks on your data:

Analysing systematic absences

Laue group: P m m m
Performing systematic absence checks on unscaled profile-integrated data
Read 4690 predicted reflections
Selected 4506 profile integrated reflections
Removed 1840 reflections with d <= 0.96
Combined 80 partial reflections with other partial reflections
Laue group: P m m m
Scoring method: fourier
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis   |   Score |   No. present |   No. absent |   <I> present |   <I> absent |   <I/sig> present |   <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 21a          |   0.99  |             3 |            3 |        33.797 |        8.859 |            35.244 |           10.699 |
| 21b          |   0.995 |             3 |            4 |       103.125 |       12.033 |           125.073 |           18.319 |
| 21c          |   0     |             0 |            0 |         0     |        0     |             0     |            0     |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group   |   score |
|---------------+---------|
| P 2 2 2       |  0      |
| P 2 2 21      |  0      |
| P 2 21 2      |  0.0101 |
| P 21 2 2      |  0.0046 |
| P 21 21 2     |  0.9852 |
| P 21 2 21     |  0      |
| P 2 21 21     |  0      |
| P 21 21 21    |  0      |
+---------------+---------+
Recommended space group: P 21 21 2
Saving reindexed experiments to symmetrized.expt in space group P 21 21 2
Saving 4690 reindexed reflections to symmetrized.refl

It doesn't pick up the third screw axis, but the _score_axis_fourier method is only called twice when I execute dials.symmetry, so I think that is an upstream problem. Or maybe there is just not enough data along that axis.
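
The space-group scores in the table above look consistent with treating each axis score as an independent probability: multiply the score for every screw axis the space group contains, and one minus the score for every axis it lacks. That combination rule is an assumption inferred from the printed numbers, and space_group_scores below is a hypothetical helper, not a DIALS function.

```python
from itertools import product

# Axis scores as printed (rounded) in the screw-axis table above.
axis_scores = {"21a": 0.99, "21b": 0.995, "21c": 0.0}


def space_group_scores(axis_scores):
    """Score all screw-axis combinations under the assumed product rule."""
    names = list(axis_scores)
    scores = {}
    for present in product((False, True), repeat=len(names)):
        label = "P " + " ".join("21" if flag else "2" for flag in present)
        score = 1.0
        for name, flag in zip(names, present):
            s = axis_scores[name]
            score *= s if flag else 1.0 - s
        scores[label] = score
    return scores


scores = space_group_scores(axis_scores)
best = max(scores, key=scores.get)
for label, score in scores.items():
    print(f"{label:<12} {score:.4f}")
print("best:", best)
```

With the rounded inputs this gives about 0.985 for P 21 21 2, close to the 0.9852 reported, and exactly 0 for every space group with a screw axis along $c$, because the 21c score is 0. Under this rule a third axis can never be recommended unless it receives a non-zero score of its own.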

dagewa commented 1 month ago

I think there's just no data along $c$, so I've got no issue with that.