xia2 / screen19

Screening program for small-molecule single-crystal X-ray diffraction data
https://pypi.org/project/screen19/
BSD 3-Clause "New" or "Revised" License

Fix screen19 for few images #29

Closed benjaminhwilliams closed 4 years ago

benjaminhwilliams commented 4 years ago

Currently, if screen19 is called on a sweep of fewer than ten images, it runs in fast_mode, which bypasses profile modelling (and also dials.report). However, this is a poor heuristic for determining that profile modelling will fail. In fact, it is usually possible to create an adequate profile model from a single image, as demonstrated below. In any case, we can default to using summation intensities for the Wilson plot analysis if we need to.

This PR also introduces French-Wilson scaling of the integrated intensities, to boost the number of valid reflections. Currently, reflections with negative intensity are simply discarded. This is probably because I wrote this code before I knew anything much about crystallography and had never heard of French-Wilson scaling. It seems much more sensible to use it.
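
To illustrate why this rescues weak reflections, here is a minimal sketch of the idea behind the French-Wilson treatment for a single acentric reflection. It is not the cctbx implementation (which, among other things, estimates the mean intensity per resolution bin). It assumes the expected intensity `mean_intensity` is already known, and the function name and example numbers are hypothetical. With an exponential (Wilson) prior on the true intensity and a Gaussian measurement error, the posterior is a normal distribution truncated at zero, so its mean is always positive.

```python
import math


def fw_acentric_mean_intensity(i_obs, sigma, mean_intensity):
    """Posterior mean of the true intensity J >= 0 for one acentric reflection.

    Prior:      p(J) = exp(-J / mean_intensity) / mean_intensity  (acentric Wilson)
    Likelihood: i_obs ~ Normal(J, sigma)
    The posterior is Normal(mu, sigma) truncated to J >= 0.
    """
    mu = i_obs - sigma**2 / mean_intensity
    z = mu / sigma
    if z < -30.0:
        # Far into the tail the pdf/cdf ratio underflows; use the
        # leading-order asymptotic value of the truncated-normal mean.
        return sigma / -z
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return mu + sigma * pdf / cdf


# Hypothetical (intensity, sigma) pairs, including one negative measurement.
observations = [(-5.0, 10.0), (3.0, 10.0), (250.0, 12.0)]

# Previous behaviour: negative intensities are thrown away.
kept = [i for i, _ in observations if i > 0]

# French-Wilson-style estimate: every reflection yields a positive value.
estimates = [fw_acentric_mean_intensity(i, s, 100.0) for i, s in observations]

print(len(kept), "reflections survive discarding;", len(estimates), "survive the estimate")
```

Strong reflections are left almost unchanged by this estimate, while weak and negative ones are pulled to small positive values instead of being lost.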

Profile modelling on single images with DIALS

To demonstrate that dials.create_profile_model can produce a workable profile model from even just a single image, I ran the following minimal processing routine on each image from the x4wide test data set:

dials.import <image> &&
  dials.find_spots imported.expt &&
  dials.index imported.expt strong.refl &&
  dials.create_profile_model indexed.*
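
For anyone wanting to repeat this over many images, the routine above could be driven from Python roughly as follows. This is only a sketch: the function names and the `per_image` directory layout are hypothetical, and it assumes that `dials.index` writes its default `indexed.expt` and `indexed.refl` outputs, which is what the `indexed.*` glob would match.

```python
import subprocess
from pathlib import Path


def profile_model_commands(image):
    """Return the four DIALS commands from the routine above for one image."""
    return [
        ["dials.import", str(image)],
        ["dials.find_spots", "imported.expt"],
        ["dials.index", "imported.expt", "strong.refl"],
        # Assumes the default dials.index output file names.
        ["dials.create_profile_model", "indexed.expt", "indexed.refl"],
    ]


def run_per_image(images, work_root="per_image"):
    """Process each image in its own directory; return the images that failed."""
    failed = []
    for image in images:
        image = Path(image).resolve()
        workdir = Path(work_root) / image.stem
        workdir.mkdir(parents=True, exist_ok=True)
        try:
            for command in profile_model_commands(image):
                # check=True stops this image's chain at the first failure,
                # mirroring the '&&' in the shell version.
                subprocess.run(command, cwd=workdir, check=True)
        except subprocess.CalledProcessError:
            failed.append(image)
    return failed
```

Running each image in a separate directory keeps the fixed output names (`imported.expt`, `strong.refl` and so on) from overwriting one another.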

I also ran the same routine on the entire sweep together, to get the 'true' profile model for comparison. The results are shown in this plot, with each datum corresponding to a single image, and dashed lines to the full-sweep values.

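[Figure: beam divergence and mosaicity estimated from each single image of the sweep, with dashed lines marking the full-sweep values.]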

Unsurprisingly, the beam divergence is well determined from a single image but the mosaicity determination is poor. It is not as poor as one might expect from a single image, though, and the error is systematic: the precision is fairly good, even if the accuracy is not. In any case, since profile-fitted integration will fail for a single image, we will default to summed intensities anyway, so it doesn't matter that the estimate of mosaicity is a poor reflection of the true value.
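
The fall-back to summed intensities might look something like the sketch below. It is not the screen19 code: a plain dictionary stands in for a DIALS reflection table, and the function name is hypothetical. It relies only on the standard DIALS column names `intensity.prf.*` and `intensity.sum.*`.

```python
def choose_intensities(reflections):
    """Prefer profile-fitted intensities, else fall back to summed ones.

    `reflections` is a mapping of column name to a list of values, used
    here as a stand-in for a DIALS reflection table.
    """
    for prefix in ("intensity.prf", "intensity.sum"):
        value_key = f"{prefix}.value"
        variance_key = f"{prefix}.variance"
        if value_key in reflections and variance_key in reflections:
            return prefix, reflections[value_key], reflections[variance_key]
    raise KeyError("no integrated intensity columns found")


# A single-image case where profile fitting produced no columns.
table = {
    "intensity.sum.value": [12.0, -3.5, 140.2],
    "intensity.sum.variance": [9.0, 8.1, 30.4],
}
kind, values, variances = choose_intensities(table)
print("using", kind)
```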

graeme-winter commented 4 years ago

<- 👀 now

benjaminhwilliams commented 4 years ago

> Looks sensible, think this is a good change set, but I also think it would be good to have some kind of newsfragment to say that the F&W protocol is being run on the data before analysis.

👍

> It's also possible (though unlikely) that the scaling in
>
> miller_array = miller_array.french_wilson().as_intensity_array()
>
> could fail - should the fall-back be to use the input data as provided (e.g. the previous behaviour) - I think the F&W scaling has some assumptions embedded in it....

Do you know the anticipated failure mode? Having dug around in cctbx.miller enough last week to want to quit my job, I don't anticipate being able to infer much about expected exceptions from a second glance.

graeme-winter commented 4 years ago

> > Looks sensible, think this is a good change set, but I also think it would be good to have some kind of newsfragment to say that the F&W protocol is being run on the data before analysis.
>
> 👍
>
> > It's also possible (though unlikely) that the scaling in
> >
> > miller_array = miller_array.french_wilson().as_intensity_array()
> >
> > could fail - should the fall-back be to use the input data as provided (e.g. the previous behaviour) - I think the F&W scaling has some assumptions embedded in it....
>
> Do you know the anticipated failure mode? Having dug around in cctbx.miller enough last week to want to quit my job, I don't anticipate being able to infer much about expected exceptions from a second glance.

TBH I would not worry about precise failure modes, and consider only "this may fail, fall back to ..." - or leave it until a case is reported where it does fail? Then you have a case to demonstrate.

benjaminhwilliams commented 4 years ago

> TBH I would not worry about precise failure modes, and consider only "this may fail, fall back to ..." - or leave it until a case is reported where it does fail? Then you have a case to demonstrate.

OK, I think I'll leave this until/unless it proves to be a problem. I don't really want to catch this with a bare except.
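
For reference, if a failing case does turn up, a fall-back that avoids a bare except could be sketched along these lines. Everything here is hypothetical: the helper name is made up, and the exception types that cctbx would actually raise are unknown (as discussed above), so the caller has to supply them once a real failure has been observed.

```python
import logging

logger = logging.getLogger(__name__)


def apply_with_fallback(transform, data, expected_errors):
    """Apply `transform` to `data`, returning `data` unchanged on a known failure.

    `expected_errors` is a tuple of exception classes. Anything outside
    that tuple still propagates, unlike with a bare except.
    """
    try:
        return transform(data)
    except expected_errors as error:
        logger.warning("Transform failed, using the input data as provided: %s", error)
        return data


def always_fails(values):
    # Stand-in for a scaling step that rejects its input.
    raise ValueError("cannot scale these data")


result = apply_with_fallback(always_fails, [1.0, 2.0], (ValueError,))
```

In the real code, `transform` would be something like `lambda a: a.french_wilson().as_intensity_array()`, with the tuple filled in from whatever exception the reported case demonstrates.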