reflectometry / refl1d

1-D reflectometry fitting
https://refl1d.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
18 stars 24 forks source link

Post-processing data after fit for display #190

Open bmaranville opened 2 months ago

bmaranville commented 2 months ago

There have been several requests to be able to show a single dataset after fitting for time-of-flight data. The idea is that you do the fit on the unstitched data (separate datasets for all the different incident angles) and then rebin them all into a single dataset for simpler visualization and display in e.g. publications.

One way to accomplish this would be to formalize the rebinning process into a function with known user parameters, available from a custom TOF Experiment class, taking advantage of the custom plots that will be enabled when https://github.com/bumps/bumps/pull/160/ is merged

alexander-grutter commented 2 months ago

I am one of the requesters so I guess I should comment. I want this. Would be nice to support either constant DeltaQ bins or a constant dQ/Q

andyfaff commented 2 months ago
  1. For reproducibility purposes a minimum publishing standard should be that one supplies the reduced data in the same state that was used for the analysis. This can be supplied in ESI or via Zenodo, etc. I believe a script allowing the reader to reproduce the analysis should also be available.
  2. As a reader of a scientific article I would much prefer for data displayed in the figures to be of the same state that was used for analysis.

Unfortunately point 1 often doesn't happen (I'm a major offender still). If that's the case it would be misleading to do the analysis with the data in one state (unbinned/unstitched), then create figures with the data in another state (stitched and binned). It makes it hard for the reader to judge whether the analysis was done correctly, it is also against FAIR principles (analysis is less repeatable). For example, the fit might be poor and look poor when unstitched/unbinned, but it might look great (but still be poor) when stitched/rebinned. Whilst the scientist might not intend to do that, it's always a possibility. All of these arguments depend on how much re-binning is done. The less the data is aggregated, the closer it is to its original form.

One argument might be "there is no difference statistically" (multiple datasets vs a single dataset). If that's the case then the same state should be used for analysis and figure creation.

A possible compromise is to rebin individual angles more, and do the analysis with the angles unstitched (total of N data points). Then when displaying the data in a figure, show the data as a single dataset, angles appropriately scaled, but still with N points. Ok, there will be noise in the overlap regions, but it'll still be authentic.

Lastly, constant Q spacing isn't a good match for most TOF experiments.

For existing spallation source instruments (it'll be different at ESS) the resolution is dominated by the angular terms, until the wavelength contribution from bin sizes is comparable. Rebinning in constant dQ/Q makes more sense. A constant dQ/Q resolution function is simpler to relate, and the data is still spaced equally in log-space. Most TOF reflectometers at reactor sources also operate in constant dQ/Q mode, with data spaced equally in log-space. The only exception is when the chopper phases are opened to provide more flux, at which point dQ/Q becomes a linear function of Q. Binning TOF data into constant Q spacing kind of hides the real manner in which the data was obtained.

I'm interested to know why constant Q spacing is desired. Is it for some kind of ML application?

alexander-grutter commented 2 months ago

This is an issue I really struggle with, and I think that to a certain extent the right answer depends on who we are really trying to communicate with when we publish our papers. I think it's important to remember that there are realistically only about 30 people on earth who are going to be able to follow us through our fitting process in detail. For those people, providing the "analysis state" reduced data and maybe a script is going to be sufficient. However, I have literally never once checked someone else's fit this way. Nobody has ever bothered me about my fits post-publication either.

That being said, implementing your point #1 is unquestionably the best practice and has no real downside. So, we should do it.

On the other hand, because basically nobody reading our papers understands the analysis in detail, the PNR fits I publish are almost exclusively packaged for non-expert consumption. That is, we have to help both the readers and the referees understand what is going on. The referee part is very important - we have been experimenting with point #2 in publications recently, and the response from reviewers would best be described as "overwhelmingly negative". As far as I can tell, referees feel that their job of "evaluating the fit quality" is significantly more challenging when there are many overlapping scans with different error bars sizes, etc. I have a lot of sympathy for this position, as I also have a lot of trouble judging a fit visually prior to rebinning.

So, while I think from a raw scientific perspective you are unquestionably correct, there are many cases where showing the raw, unstitched data state being used to fit can kind of be anti-FAIR, in that non-expert readers might have a harder time judging the fit.

If we really want to help people understand whether a fit is good or bad, I would argue that the best switch we can make is plotting in Log(Fresnel) or Log(Q^4) scaling instead of pure Log(Intensity). More nonsense hides in that plotting choice than in any of these binning questions. This is a total sidetrack, but I can't help myself.

Anyway - Probably the answer is:

  1. Provide the "analysis state" data with all papers
  2. Show something that is reasonably digested by non-expert readers
  3. Clearly explain how the figure was constructed from the "analysis state"

The constant deltaQ binning is for continuous beam sources, which often acquire data this way. What reactor sources have historically done is take multiple theta-2theta scans with points at nominally the exact same angular positions, and then sum those scans. I've come to the conclusion that this should not be done. Each scan should be fit separately to a coupled model. In this case, you would want constant deltaQ rebinning for quick visualization.

alexander-grutter commented 2 months ago

Maranville says I should provide this figure (my current fitting project) to provide context - nobody can possibly interpret this without additional binning. I might as well not show the data and just tell the reader how the data made me feel about the sample :-)

Arguably I should be fitting the data in a different form, but... this feels like the most pure form of it? IDK

image

andyfaff commented 2 months ago

However, I have literally never once checked someone else's fit this way.

I've checked others fits a few times. I would like to do it more frequently. Not really because I doubt what they report, but because I'm trying to learn if there are better ways to analyse their data. This could seeing if there are better models, trying to understand how different instruments handle resolution, or how to improve various aspects of modelling software. For example, I'm looking at maximum entropy for freeform modelling and if I see a system that it might be useful for, then it's nice to have the original data.

Nobody has ever bothered me about my fits post-publication either.

For the community in general (i.e. not about your fits) it's probably not a common desire, but there's also an embarrassment factor in asking, or not wishing to cause offence because there's nothing to be gained if the article is already published.

only about 30 people on earth who are going to be able to follow us through our fitting process in detail

Possibly for the kinds of PNR fitting that are involved. But for unpolarised work I would've thought the co-refinement of individual angles and the rebinning/stitching processes we're talking about here would be followable by a lot more people, I reckon my ex-students could handle that.

I have a lot of sympathy for this position, as I also have a lot of trouble judging a fit visually prior to rebinning.

I have sympathy as well, and pretty much agree with both of your positions. Datasets infested with points are impossible to understand. I just wanted to raise a caution flag.

For TOF NR data with lots of points and several different angles I agree that coupled fitting makes sense. I am guessing that the data you display is for that case. I think the following process would make more sense:

  1. rebin each angle. This reduces the number of datapoints and reduces scatter on points because stats in each bin are better.
  2. do the co-refinement of the multiple angles. Question: if you fit scaling factors during the coupled fitting do the uncertainties on each point have to be amended?
  3. When creating figures plot the data from a given measurement (i.e. the multiple angles) as one series in the same colour. There will be much more clarity as the number of points has been reduced in 1.

It looks like there's hardly any binning in the datasets you displayed. Typically we would rebin to something like 2%, giving us ~130 points out to Q=0.25 (ish). The level of rebinning can be aggressive. Use the largest value that doesn't excessively degrade the Q resolution you need. For the posted figure you could probably even rebin at 3-4% because the feature spacing is wide. This also has the advantage that fits go faster because you reduce the number of points you need to calculate the theoretical curve at. This does cause issues if your resolution smearing code assumes that the point density is high enough such that adjacent points lie within the resolution kernel of each other, oversampling may be needed.

For Platypus the intrinsic wavelength resolution of our chopper system might be 3.3%, with the angular term matching. With no rebinning the overall resolution is sqrt(3.3**2 + 3.3**2) = 4.66. If you rebin at 2% this only degrades the resolution to sqrt(3.3**2 + 3.3**2 + 2**2) = 5.07. Rebinning doesn't dominate because terms add in quadrature. Even if intrinsic wavelength resolution is very good (e.g. spallation sources) rebinning at 2% hardly has an effect, e.g. sqrt(3.3**2 + 2**2) = 3.85.

alexander-grutter commented 2 months ago

It's (ironically?) hard to see the example I posted, but the small amount of binning is actually needed for that dataset. There is high frequency magnetic stuff that is hard to see outside spin asymmetry. Maybe one could bin a little harder, but not that much, and the fits aren't that slow on a cluster.

"1. rebin each angle. This reduces the number of datapoints and reduces scatter on points because stats in each bin are better."

I think maybe the real problem I run into here is when I take a measurement, decide I need more stats, and then circle back. I've concluded those datasets shouldn't be summed, and when I look really closely there's usually a small theta offset. So there are multiple copies of the same angle floating around. It's not a huge effect, but my modeling workflow has gotten progressively more complicated and paranoid about fitting artifacts. Hence the unbinned analysis of repeated runs at the same angle

Anyway, I think I lot of the points you are raising are really important, and you're hitting a lot of the concerns I struggle with.

"2. do the co-refinement of the multiple angles. Question: if you fit scaling factors during the coupled fitting do the uncertainties on each point have to be amended?"

Do you mean to account for the uncertainty on the scaling factor, or just rescaling the uncertainty directly by I0?

andyfaff commented 2 months ago

Do you mean to account for the uncertainty on the scaling factor, or just rescaling the uncertainty directly by I0?

When we splice datasets from individual angles together we calculate a scaling factor that would overlap the second onto the first. This scaling factor comes with an uncertainty. We then multiply the reflectivity in the second angle by the scaling factor, and propagate the uncertainties on the second angle dataset appropriately. To first order we know what the scaling factor should be, because it corresponds to an attenuator transmission. The precise scaling factor is a second order correction to that. So I believe the uncertainty propagation for the second angle dataset can be reduced to multiplying the uncertainties by the first order multiplier.

For your system I'm guessing that you allow the scale factors for all angles to independently vary in a fit. However, isn't there also prior knowledge of the system that could be included? For example, you know that successive angles should overlap, so could the prior PDF for the scale factors be narrowed?