Quick comment after today's meeting, which I try to wrap up without looking into the changes introduced here:
Maybe we could use the summary method of the ModelResult class in lmfit? That would already give us the information in some dictionary which could then be used. If it turns out to require too many lines of code in the end, or something really general, one could think about whether this is functionality for the core.
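For reference, a minimal sketch of what using ModelResult.summary() could look like (assuming lmfit >= 1.2.0, where the method was introduced; the model and data here are just placeholders):

```python
# Minimal sketch, assuming lmfit >= 1.2.0 (ModelResult.summary() was added there).
import numpy as np
from lmfit.models import GaussianModel

x = np.linspace(-5.0, 5.0, 201)
y = 3.0 * np.exp(-0.5 * (x - 0.5) ** 2) + np.random.normal(0.0, 0.1, x.size)

model = GaussianModel()
result = model.fit(y, model.guess(y, x=x), x=x)

summary = result.summary()    # plain dict with fit statistics and parameters
print(summary['params'])      # parameter information (see the format caveat below)
```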
I looked into .summary(). Looks like this dict would contain what we need in ['params']. However, it was only introduced in lmfit 1.2.0 (5/2023) and would require a change in our setup requirements. For now, I introduced a qudi-core helper here.
> I looked into .summary(). Looks like this dict would contain what we need in ['params']. However, it was only introduced in lmfit 1.2.0 (5/2023) and would require a change in our setup requirements. For now, I introduced a qudi-core helper here.
If the modules in the qudi-iqo-modules addon package are using this feature, it would make sense to just include this package version requirement in its setup.py, e.g. lmfit>=1.2.0. Problem solved and no need to clutter qudi-core with something that is outdated anyway. 👍
Unfortunately, it is more complicated than I suspected: lmfit's .summary() returns the parameters as a tuple (not a dict) with hard-coded order. When dumping, this yields human-unreadable results like:
'params': [('offset', 200371.95891243816, True, None, 186042.13771090773, 208322.72278583102, None, 123.92946774026674, {'center': 2.0450782521128066e-06, 'sigma': 0.2311852154836438, 'amplitude': -0.1359148619646129}, 200074.4790258307, None), ('center', 2867441428.334397, True, None, 2770000000.0, 2970000000.0, None, 308997.90197928884, {'offset': 2.0450782521128087e-06, 'sigma': -1.2424267099560231e-05, 'amplitude': -2.0366472860784767e-05}, 2868000000.0, None), ('sigma', 2751341.5731732002, True, None, 1000000.0, 100000000.0, None, 317605.2799702476, {'offset': 0.23118521548364368, 'center': -1.2424267099560231e-05, 'amplitude': 0.5250577474849855}, 1944560.4652284337, None), ('amplitude', -6567.367922722109, True, None, -6567.367922731617, 0, None, 64.77947751158845, {'offset': -0.13591486196461305, 'center': -2.0366472860784767e-05, 'sigma': 0.5250577474849853}, -3283.6839613658085, None)]}
while we would prefer a human-readable, dict-like style (as with the added core .dict_result() function):
{'offset': {'value': 200917.5114554685, 'min': 188589.88295235785, 'max': 207777.30558512648, 'stderr': 115.82321083006927}, 'center': {'value': 2869935442.325448, 'min': 2770000000.0, 'max': 2970000000.0, 'stderr': 521566.4696499122}, 'sigma': {'value': 4719809.358276054, 'min': 1000000.0, 'max': 100000000.0, 'stderr': 549534.733591269}, 'amplitude': {'value': -4465.000169262333, 'min': -6501.286534228013, 'max': 0, 'stderr': 435.03743948837626}}
This means that even with .summary(), we still need a helper that makes the result human-readable.
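For illustration, a minimal sketch of what such a readability helper could do (hypothetical code, not the actual qudi-core .dict_result() implementation):

```python
# Hypothetical helper: turn lmfit Parameters into a nested, human-readable dict.
# The actual qudi-core dict_result() may differ in name and content.
def params_to_readable_dict(params):
    return {
        name: {'value': p.value, 'min': p.min, 'max': p.max, 'stderr': p.stderr}
        for name, p in params.items()
    }

# usage with an lmfit fit result:
# readable = params_to_readable_dict(fit_result.params)
```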
@Neverhorst I implemented both ways here. Which do you prefer?
@timoML, ok, I see now that it is not so simple.
While looking into the matter, however, I realised that we should probably agree on what information we want included in the file header of each data file.
In your example you also save the boundary conditions (min and max) together with the fit result (value and stderr). This seems a bit arbitrary to me because you combine the fit result with fit model information, but only part of it.
For the sake of consistency I would propose either saving the entire information contained in lmfit.ModelResult or just the parameter results, i.e. value +- stderr. The first is not human-readable and needs tooling to extract the full ModelResult object from a data file, while the latter can be quite easily implemented and is human-readable (my favourite).
This question again reiterates the old conflict between scientists wanting to hoard every shred of information in this single file (even though it is never accessed 99.9% of the time) and easy use and readability. From experience I would strongly advocate the path of readability. The new way of fitting in qudi allows you to perform the fit again at any time without starting qudi if the need arises (e.g. you need the fit statistics/metrics for a paper). So as long as the file header contains enough information to reproduce the fit, I see no reason to clutter it with meaningless information that defeats the benefits of a plain-text, human-readable file in the first place. If you want maximum information, you create proper databases and work with queries/tooling to retrieve pieces of information and do not use text files.
TL;DR: I would propose the fit information dict should look like:
{'offset' : '(200.92 +- 0.12)kc/s',
'center': '(2.8699 +- 0.0006)GHz',
...}
And of course the metadata should also contain the fit type.
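A rough sketch of how such strings could be produced (just an illustration of the proposed format with SI prefixes; it ignores edge cases like a missing stderr and is not the existing qudi formatting code):

```python
import math

_SI_PREFIXES = {-9: 'n', -6: 'u', -3: 'm', 0: '', 3: 'k', 6: 'M', 9: 'G'}

def format_param(value, stderr, unit=''):
    """Format a fit parameter as '(value +- stderr)<prefix><unit>'."""
    exponent = 3 * int(math.floor(math.log10(abs(value)) / 3)) if value else 0
    exponent = max(min(exponent, 9), -9)
    scale = 10.0 ** exponent
    return f'({value / scale:.4g} +- {stderr / scale:.2g}){_SI_PREFIXES[exponent]}{unit}'

# e.g. format_param(2.8699e9, 6e5, 'Hz') -> '(2.87 +- 0.0006)GHz'
```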
@Neverhorst I think there is currently no general way of saving the units with arbitrary fit parameters. E.g. take the dimensionless parameter beta in stretched decays. Of course the x/y axis units are stored in the metadata property 'column_headers'.
> @Neverhorst I think there is currently no general way of saving the units with arbitrary fit parameters. E.g. take the dimensionless parameter beta in stretched decays. Of course the x/y axis units are stored in the metadata property 'column_headers'.
Have a look at how I implemented that here, pretty much in the same way as FitContainer.formatted_result. The logic can pass units to the helper function; if none are given, none are formatted into the string.
Independent of how the result is formatted in the end (string vs. dict), I would prefer this structure because it is in line with the current way of handling results (all in the same place and working the same way).
I see, I like the mechanism. (Although I think our current logic modules would hardly know the units for arbitrary fit functions.)
Btw: what is the reasoning behind dropping fixed parameters in formatted_result? For me it is a bit unexpected that a fixed parameter won't show up, e.g. in the pulsed GUI fit result.
> I see, I like the mechanism. (Although I think our current logic modules would hardly know the units for arbitrary fit functions.)
True, but at least you have a mechanism where this could be customized in the future.
> Btw: what is the reasoning behind dropping fixed parameters in formatted_result? For me it is a bit unexpected that a fixed parameter won't show up, e.g. in the pulsed GUI fit result.
I think this was a quick fix because non-varying parameters have no stderr set. It is also a matter of taste, because some simplified fit models are derived from more complex models with some parameters fixed (e.g. offset), and having them around would give a false impression of the model complexity.
But yeah, I also understand the desire to have them all shown. We could include them if you like, but we should then make it a bit more complex by flagging the fixed parameters as such in the output string.
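A possible way to keep the fixed parameters but flag them (sketch only, not the actual FitContainer.formatted_result code; units are passed in as an optional dict):

```python
def format_params(params, units=None):
    """One line per lmfit parameter; fixed parameters are flagged instead of dropped."""
    units = units or {}
    lines = []
    for name, param in params.items():
        text = f'{name}: {param.value:.6g}'
        if param.stderr is not None:
            text += f' +- {param.stderr:.2g}'
        if units.get(name):
            text += f' {units[name]}'
        if not param.vary:
            text += ' (fixed)'
        lines.append(text)
    return '\n'.join(lines)
```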
Fit parameters in pulsed and ODMR are now saved in the metadata of data files when saving a measurement.
Description
The fit model name and the fit parameters (as given by the formatted result output) are added to the metadata. Whether this data is added depends on whether a fit is available for the (alternative) data in pulsed, or for the channel and range in ODMR. If no fit is available, no fit data is added to the metadata. In pulsed the fit is saved in the pulsed_measurement data file; in ODMR it is saved in all text files.
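As a rough illustration only (key names and value formatting here are made up and may differ from what this PR actually writes), the saved metadata could then contain something like:

```python
# Hypothetical example of the added metadata entries; the real key names and
# value formatting are defined by the implementation in this PR.
metadata = {
    'fit_model': 'lorentziandip',
    'fit_parameters': {
        'offset': '(200.92 +- 0.12)kc/s',
        'center': '(2.8699 +- 0.0006)GHz',
    },
}
```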
Motivation and Context
In pulsed, the fit parameters are only printed next to the figure in the PNG file. For ODMR, no fit parameters are saved at all. It might be useful to be able to access the fit parameters from the text file.
Solves issue #88.
How Has This Been Tested?
Tested with dummy configuration of the current version.
Checklist:
My changes are added to /docs/changelog.md.