Ideas to improve fitting function

While the fitting function sometimes works right out of the gate for every peak, I've found that to be fairly rare. I have found some hints as to what went sideways, though, and from there some next things to try, so I'd like to have these added to the fitting process. I've listed them by the issue they address and how to tell it occurred, sorted by how likely I think each one is.

Issue: Fit function is fitting noise
Cause: The lattice spacing value (_cell_length_a) from the CIF file doesn't match the experimental data (the value is often taken from reference literature or prior data). GSAS-II will sometimes latch onto noise instead of a real peak, since the GUI process for assigning the peak location depends on the user finding a high and visible peak. The scripting process is blind to the display.
Tells: Negative intensity fit values; sigma values that are negative and/or much larger than those of nearby peaks.
Fixes: Hopefully at least one peak per phase is fit. Use the two_theta value and wavelength to get a new estimate for the lattice spacing value and try again. Can also try a series of locations.
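
A minimal sketch of the lattice re-estimate and the noise check described above, not code from the repo. It assumes a cubic phase (so a single lattice parameter) and that the Miller indices of the fitted peak are known; the function names and the `max_sig_ratio` threshold are made up for illustration.

```python
import math
from statistics import median


def cubic_lattice_from_peak(two_theta_deg, wavelength, hkl):
    """Estimate a cubic lattice parameter from one fitted peak via Bragg's law."""
    theta = math.radians(two_theta_deg / 2.0)
    d_spacing = wavelength / (2.0 * math.sin(theta))  # lambda = 2 d sin(theta)
    h, k, l = hkl
    # ASSUMPTION: cubic cell, so a = d * sqrt(h^2 + k^2 + l^2)
    return d_spacing * math.sqrt(h * h + k * k + l * l)


def looks_like_noise(area, sig, neighbor_sigs, max_sig_ratio=5.0):
    """Flag a fit with negative area, negative sig, or sig far above its neighbors.

    max_sig_ratio is an arbitrary illustrative threshold, not a tuned value.
    """
    if area < 0 or sig < 0:
        return True
    positive = [s for s in neighbor_sigs if s > 0]
    if not positive:
        return False  # nothing trustworthy to compare against
    return sig > max_sig_ratio * median(positive)
```

The result has the same length unit as `wavelength`, so a wavelength in Angstroms gives a lattice parameter in Angstroms that can replace the CIF value before the next attempt.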

Issue: Peak fit wandered to an adjacent peak
Cause: Similar to fitting noise.
Tells: Mismatch between the two_theta fit and theoretical intensity values. It's common for these values to differ a bit, but the ratio between them should be about the same for each peak. (That's part of why I added a plot of these two values, but the variation isn't very visible without a diagonal line.)
Fixes: Similar to fitting noise: try to drop the peak list location closer to the real peak. We've also talked over submitting a series of peak locations (5 or 10) to see if we land on a peak, since we're doing things blind.
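
One way the ratio check and the series of trial locations could look, as a rough sketch. The function names, the 50% `tolerance`, and the window size are hypothetical choices; the intensities would be the per-phase fit and theoretical values already collected by the fitting code.

```python
from statistics import median


def flag_ratio_outliers(fit_intensity, theo_intensity, tolerance=0.5):
    """Return indices of peaks whose fit/theoretical ratio strays from the median.

    ASSUMPTION: theoretical intensities are all positive and the lists hold
    the peaks of a single phase, in the same order.
    """
    ratios = [f / t for f, t in zip(fit_intensity, theo_intensity)]
    typical = median(ratios)
    return [i for i, r in enumerate(ratios)
            if abs(r - typical) > tolerance * abs(typical)]


def candidate_positions(center, half_window, count=5):
    """Evenly spaced trial two_theta positions across center +/- half_window."""
    if count < 2:
        return [center]
    step = 2.0 * half_window / (count - 1)
    return [center - half_window + i * step for i in range(count)]
```

Using the median rather than the mean keeps one wandering peak from dragging the reference ratio along with it.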

Issue: Peak overlap
Cause: The peak fit program expects separate peaks. Instrument effects that broaden peaks, small phase fractions, and/or phases with similar two_theta values can cause overlap.
Tells: The two_theta values for the theoretical intensities will be similar. Maybe use the sig or gam values (I assume the units are somehow related to two_theta) to estimate the amount of overlap? I added Example06, which will have some issues with overlap and which we can use to test (it also has known phase fraction values since I simulated it :) ).
Fixes: There's a separate fitting approach (set_refinement, see https://gsas-ii.readthedocs.io/en/latest/GSASIIscriptable.html#code-examples) that we could use. This would lock the lattice parameters. Unfortunately it doesn't easily deal with textured samples (i.e. substantial variation in the n_int values) right now, but it seems possible to integrate this in.
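
A possible way to turn sig and gam into an overlap estimate, as a sketch under assumptions that should be checked first. It assumes sig is a Gaussian variance in centidegrees squared and gam is a Lorentzian FWHM in centidegrees (my understanding of the GSAS-II convention for constant-wavelength data, not something confirmed here), and it combines the two with the Thompson-Cox-Hastings approximation. The function names are hypothetical.

```python
import math


def peak_fwhm_deg(sig, gam):
    """Approximate total peak FWHM in degrees two_theta.

    ASSUMPTION: sig is a Gaussian variance in centidegrees**2 and gam is a
    Lorentzian FWHM in centidegrees. Negative (bad-fit) values are clamped to 0.
    """
    fg = 2.0 * math.sqrt(2.0 * math.log(2.0) * max(sig, 0.0))  # Gaussian FWHM
    fl = max(gam, 0.0)                                         # Lorentzian FWHM
    # Thompson-Cox-Hastings combination of the two widths
    total = (fg**5 + 2.69269 * fg**4 * fl + 2.42843 * fg**3 * fl**2
             + 4.47163 * fg**2 * fl**3 + 0.07842 * fg * fl**4 + fl**5) ** 0.2
    return total / 100.0  # centidegrees -> degrees


def overlap_ratio(pos_a, fwhm_a, pos_b, fwhm_b):
    """Mean FWHM divided by peak separation; near or above 1 suggests overlap."""
    separation = abs(pos_a - pos_b)
    if separation == 0:
        return math.inf
    return 0.5 * (fwhm_a + fwhm_b) / separation
```

Running this over neighboring theoretical peak positions in Example06 would show whether the ratio separates the known overlapping pairs from the clean ones before any threshold is chosen.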

Issue: Background doesn't fit well
Cause: I picked '5' as the number of Chebyshev polynomials to use (line 125 as of commit 92f0dea). It's a completely arbitrary choice, but it seemed to work well for the example data.
Tells: Too few polynomials means the background fit diverges from the experimental data, usually at the edges of the data. Too many can result in the peaks being fit as 'background' and then not counted. Also, since there are usually more datapoints in the 'background' of the data, errors here tend to result in mismatches in the peak values to compensate during the least squares minimization.
Fixes: Check the background fit. I'm not sure how to do this automatically; some ideas are checking the error at the ends, masking the peak locations and checking, and/or iterating through a few different values and checking the fit quality (one of the things reported during fitting).
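
The first two ideas (error at the ends, masking the peaks) could be combined roughly as below. This is only a sketch: it assumes the observed and fitted-background intensities are already available as plain lists on the same two_theta grid, and the function name, mask width, and edge fraction are placeholders.

```python
import math


def rms(values):
    """Root-mean-square of a list; 0.0 for an empty list."""
    return math.sqrt(sum(v * v for v in values) / len(values)) if values else 0.0


def background_edge_report(two_theta, observed, background, peak_positions,
                           mask_half_width=0.5, edge_fraction=0.1):
    """Compare background residuals at the pattern edges with the masked interior.

    Points within mask_half_width (degrees) of any peak position are skipped,
    so only regions that should be pure background are scored.
    """
    lo, hi = min(two_theta), max(two_theta)
    edge_width = edge_fraction * (hi - lo)
    edge, interior = [], []
    for x, y_obs, y_bkg in zip(two_theta, observed, background):
        if any(abs(x - p) <= mask_half_width for p in peak_positions):
            continue  # too close to a peak to judge the background
        near_edge = (x - lo <= edge_width) or (hi - x <= edge_width)
        (edge if near_edge else interior).append(y_obs - y_bkg)
    return {"edge_rms": rms(edge), "interior_rms": rms(interior)}
```

Calling this once per trial number of polynomials gives a small table to compare: an edge RMS well above the interior RMS hints at too few terms, while judging "too many" would still need the reported fit quality or a look at the peak areas.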

For adding the Lorentzian (gam), here are the changes to add/try (line 155 as of commit 92f0dea): add two new lines, hist.set_peakFlags(pos=True,area=True,gam=True) and hist.refine_peaks('hold').

This should allow the prior sig value to stay. The 'hold' option was added to refine_peaks in GSAS-II version 4821 to keep the prior fit values from being overridden with the instrument parameter values. If successful, the sides of the peak should be fit better, and the values for gam will likely increase. Hopefully the overall shape gets better as well.

We can also add a step where sig and gam are both fit (set_peakFlags has 'True' for both of them), but they tend to be highly correlated and make the fit get squirrelly unless the prior values are pretty much spot on. The advantage is that we then get uncertainties for each one.
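
Put together, the two steps might look like the sketch below. The wrapper function and its `refine_both` switch are hypothetical; the only GSAS-II calls used are the set_peakFlags and refine_peaks('hold') calls quoted above. Using 'hold' again for the combined sig and gam step is my assumption, made so that step starts from the values the gam-only step left behind.

```python
def refine_widths_in_stages(hist, refine_both=False):
    """Refine gam with prior widths held, then optionally sig and gam together.

    `hist` is expected to be a GSAS-II powder histogram object (the same
    `hist` used in the existing fitting code) whose peaks were already fit
    for position, area, and sig.
    """
    # Stage 1: turn on the Lorentzian term; 'hold' keeps the earlier fit
    # values instead of resetting widths from the instrument parameters.
    hist.set_peakFlags(pos=True, area=True, gam=True)
    hist.refine_peaks('hold')

    if refine_both:
        # Stage 2 (optional): sig and gam together. They are highly
        # correlated, so this only makes sense once stage 1 looks sane.
        # ASSUMPTION: 'hold' is the right mode here as well.
        hist.set_peakFlags(pos=True, area=True, sig=True, gam=True)
        hist.refine_peaks('hold')
```

Keeping the combined step behind a switch makes it easy to compare fits with and without it on the example data before deciding whether the extra uncertainties are worth the instability.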

