A refactored McSAS for analysis of X-ray and Neutron scattering data.
.. start-badges
| |version| |commits-since| |license|
| |supported-versions| |wheel| |downloads|
| |cicd| |coverage|
.. |version| image:: https://img.shields.io/pypi/v/mcsas3.svg
    :target: https://pypi.org/project/mcsas3
    :alt: PyPI Package latest release

.. |commits-since| image:: https://img.shields.io/github/commits-since/BAMresearch/McSAS3/v1.0.3.svg
    :target: https://github.com/BAMresearch/McSAS3/compare/v1.0.3...main
    :alt: Commits since latest release

.. |license| image:: https://img.shields.io/pypi/l/mcsas3.svg
    :target: https://en.wikipedia.org/wiki/GNU_General_Public_License
    :alt: License

.. |supported-versions| image:: https://img.shields.io/pypi/pyversions/mcsas3.svg
    :target: https://pypi.org/project/mcsas3
    :alt: Supported versions

.. |wheel| image:: https://img.shields.io/pypi/wheel/mcsas3.svg
    :target: https://pypi.org/project/mcsas3#files
    :alt: PyPI Wheel

.. |downloads| image:: https://img.shields.io/pypi/dw/mcsas3.svg
    :target: https://pypi.org/project/mcsas3/
    :alt: Weekly PyPI downloads

.. |cicd| image:: https://github.com/BAMresearch/McSAS3/actions/workflows/ci-cd.yml/badge.svg
    :target: https://github.com/BAMresearch/McSAS3/actions/workflows/ci-cd.yml
    :alt: Continuous Integration and Deployment Status

.. |coverage| image:: https://img.shields.io/endpoint?url=https://BAMresearch.github.io/McSAS3/coverage-report/cov.json
    :target: https://BAMresearch.github.io/McSAS3/coverage-report/
    :alt: Coverage report
.. end-badges
McSAS3
======
McSAS3 (a refactored version of the original McSAS) fits scattering patterns to obtain size distributions without prior assumptions on the form of the size distribution. The refactored version has several new features compared to the original.
.. image:: https://user-images.githubusercontent.com/5449929/156196219-72472a71-bbd6-4506-a12b-134216deeef6.jpg
Note: due to an issue with sasmodels when using OpenCL, the fit may not match the data at all. If you see this, disable OpenCL in sasmodels by setting the environment variable ``SAS_OPENCL=none`` in the terminal from which you launch McSAS3.
This package can be installed by ensuring that:

1) you have SasModels (``pip install sasmodels``),
2) you have the most recent (21.4+) version of attrs, as well as pandas, and
3) on Windows, if you want to use the sasmodels library, it is highly recommended to run ``pip install tinycc`` so that a compatible compiler is available.

After that, you can run ``git clone https://github.com/BAMresearch/McSAS3.git`` in an appropriate location to install McSAS3.
Alternatively, you can install the released version from PyPI with::

    pip install mcsas3
You can also install the in-development version with::

    pip install git+https://github.com/BAMresearch/McSAS3.git
To run the optimizer from the command line using the test settings and test data, you can run the following command::

    python mcsas3_cli_runner.py
This stores the optimization result in a file named ``test.nxs``. This can subsequently be histogrammed and plotted using the following command::

    python mcsas3_cli_histogrammer.py -r test.nxs
This is, of course, a mere test case. The result should look like the Figure shown earlier.
To do the same for real measurements, you need to configure McSAS3 by supplying it with three configuration files: two for the optimization, and one for the histogramming.

The first is the data read configuration, which contains the parameters necessary to read a data file. The example configuration for reading a three-column ASCII file contains:
.. code-block:: yaml

    --- # configuration used to read files into McSAS3. this is assumed to be a 1D file in csv format
    # Note that the units are assumed to be 1/(m sr) for I and 1/nm for Q
    nbins: 100
    dataRange:
      - 0.0  # minimum
      - .inf # maximum. Positive infinity starts with a dot, negative infinity is -.inf
    csvargs:
      sep: ";"
      header: null # null translates to a Python "None", used for files without a header
      names: # column names
        - "Q"
        - "I"
        - "ISigma"
Here, ``nbins`` is the number of binned datapoints to which the data, clipped to within the ``dataRange`` Q limits, is rebinned. We normally rebin the data to reduce the number of datapoints used in the optimization procedure; typically 100 datapoints per decade is more than sufficient. The uncertainties are propagated, and means are calculated from the datapoints within each bin.
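The rebinning described here can be sketched as follows. This is a minimal illustration of the idea (log-spaced bins, mean Q and I per bin, propagated uncertainty of the mean), not McSAS3's actual implementation:

```python
import numpy as np

def rebin(q, i, isigma, nbins=100):
    """Average datapoints within log-spaced Q bins (assumes q > 0),
    propagating uncertainties as the standard error of a mean of
    independent points: sqrt(sum(sigma^2)) / n."""
    edges = np.logspace(np.log10(q.min()), np.log10(q.max()), nbins + 1)
    idx = np.clip(np.digitize(q, edges) - 1, 0, nbins - 1)
    qb, ib, sb = [], [], []
    for b in range(nbins):
        mask = idx == b
        n = mask.sum()
        if n == 0:
            continue  # skip empty bins rather than emitting NaNs
        qb.append(q[mask].mean())
        ib.append(i[mask].mean())
        sb.append(np.sqrt((isigma[mask] ** 2).sum()) / n)
    return np.array(qb), np.array(ib), np.array(sb)
```

Empty bins are simply dropped here; that is one reason the number of returned datapoints can be smaller than ``nbins``.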
``csvargs`` is the dictionary of options passed on to the ``pandas.read_csv`` function. The columns loaded this way should at least contain columns named 'Q', 'I', and 'ISigma' (the uncertainty on I).
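These csvargs map directly onto a ``pandas.read_csv`` call. A minimal sketch, using a hypothetical inline three-column file in place of a real dataset:

```python
import io
import pandas as pd

# the csvargs from the YAML above, as passed to pandas.read_csv
csvargs = {"sep": ";", "header": None, "names": ["Q", "I", "ISigma"]}

# a small semicolon-separated stand-in for a real data file
data = io.StringIO("0.1;1000.0;10.0\n0.2;250.0;5.0\n0.4;60.0;2.0\n")

df = pd.read_csv(data, **csvargs)
print(df.columns.tolist())  # ['Q', 'I', 'ISigma']
```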
You can also directly load NeXus or HDF5 files; for example, you can directly load the processed files produced by the DAWN software package. The file read configuration for a NeXus or HDF5 file is slightly different: the reader can either follow the 'default' attributes to the data to use, or you can supply a dictionary of HDF5 paths to the datasets to fit (the more robust option). For example:
.. code-block:: yaml

    --- # configuration used to read nexus files into McSAS3. this is assumed to be a 1D file in nexus
    # Note that the units are assumed to be 1/(m sr) for I and 1/nm for Q
    # if necessary, the paths to the datasets can be indicated.
    nbins: 100
    dataRange:
      - 0.0 # minimum
      - 1.0 # maximum for this dataset. Positive infinity starts with a dot, negative infinity is -.inf
    pathDict: # optional, if not provided will follow the "default" attributes in the nexus file
      Q: '/entry/result/Q'
      I: '/entry/result/I'
      ISigma: '/entry/result/ISigma'
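With a pathDict like the one above, the configured datasets can be pulled from the file with h5py. A minimal sketch (``load_nexus`` is a hypothetical helper for illustration, not part of McSAS3):

```python
import h5py
import numpy as np

# the pathDict from the YAML above
pathDict = {
    "Q": "/entry/result/Q",
    "I": "/entry/result/I",
    "ISigma": "/entry/result/ISigma",
}

def load_nexus(filename, pathDict):
    # read each configured dataset path from the HDF5/NeXus file
    with h5py.File(filename, "r") as h5f:
        return {key: np.asarray(h5f[path]) for key, path in pathDict.items()}
```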
The second required configuration file sets the optimization parameters for the Monte Carlo approach. The default settings (shown below) can be largely maintained. You might, however, want to adjust the convergence criterion 'convCrit' for datasets where the uncertainty estimate is not an accurate representation of the datapoint uncertainty. 'nRep' indicates the number of independent optimizations that are run. For tests, we recommend a small number, from 2 to 10. For publication-quality averages, however, we usually increase this to 50 or 100 repetitions to improve the averages and the uncertainty estimates on the final distribution. 'nCores' defines the maximum number of threads to use; the repetitions are split over these threads.
.. code-block:: yaml

    modelName: "mcsas_sphere"
    nContrib: 300
    modelDType: "default"
    fitParameterLimits:
      radius: 'auto' # automatic determination of radius limits based on the data limits. This is replaced in McHat by actual limits
      # - 3.14
      # - 314
    staticParameters:
      sld: 33.4 # units of 1e-6 A^-2
      sld_solvent: 0
    maxIter: 100000
    convCrit: 1
    nRep: 10
    nCores: 5
McSAS3 is set up so that if the maximum number of iterations 'maxIter' is reached before the convergence criterion, the result is still stored in the McSAS output state file and can still be histogrammed. This is done so you can use McSAS3 as part of a data processing workflow, to give you a first result even if the McSAS settings or data have not been configured perfectly yet.
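To make the roles of nContrib, convCrit, and maxIter concrete, here is a toy sketch of a McSAS-style acceptance loop. It is an illustration under simplified assumptions (uniform radius proposals and a user-supplied ``intensity_fn``, both hypothetical), not McSAS3's actual implementation:

```python
import numpy as np

def mc_fit(q, i_meas, isigma, intensity_fn, nContrib=300, maxIter=100000,
           convCrit=1.0, rmin=1.0, rmax=100.0, rng=None):
    """Keep nContrib contributions, propose replacing one at a time,
    accept when the reduced chi-square improves, and stop once convCrit
    or maxIter is reached."""
    rng = np.random.default_rng(0) if rng is None else rng

    def chi2(model):
        # least-squares scaling of the model onto the data before judging fit
        scale = np.sum(model * i_meas / isigma**2) / np.sum(model**2 / isigma**2)
        return float(np.mean(((scale * model - i_meas) / isigma) ** 2))

    radii = rng.uniform(rmin, rmax, nContrib)
    best = chi2(intensity_fn(q, radii))
    for _ in range(maxIter):
        if best <= convCrit:
            break  # converged: uncertainty-weighted fit is good enough
        k = rng.integers(nContrib)
        trial = radii.copy()
        trial[k] = rng.uniform(rmin, rmax)
        trial_chi2 = chi2(intensity_fn(q, trial))
        if trial_chi2 < best:  # accept only improvements
            radii, best = trial, trial_chi2
    return radii, best
```

Accepting only improvements makes the chi-square decrease monotonically; returning the current state at maxIter, converged or not, mirrors the behaviour described above.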
The fit parameter limits are best left on automatic; in that case, the size range for the MC optimization is set automatically from the Q range of your data. This requires the data to be valid throughout its loaded or preset data limits. Likewise, a zero Q value should be avoided for the automatic size range determination to work.
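As an illustration of why the Q limits matter, a common scattering heuristic (not necessarily the exact formula McSAS3 uses) maps the measured Q range to the resolvable size range via r_min ≈ π/q_max and r_max ≈ π/q_min, which clearly requires strictly positive Q values:

```python
import numpy as np

def auto_radius_limits(q):
    # heuristic size limits from the Q range; q must be strictly positive,
    # which is why a zero Q value breaks automatic range determination
    q = np.asarray(q, dtype=float)
    assert np.all(q > 0), "automatic limits need q > 0"
    return np.pi / q.max(), np.pi / q.min()
```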
As for models, mcsas_sphere is an internal sphere model that does not rely on a functioning SasModels installation. Other model names are discovered in the SasModels library.
Absolute intensity calculation has been lightly tested for data with input units of 1/nm for Q and 1/(m sr) for I. In this case, the SLD should be entered in units of 1e-6 1/Å². However, bugs in the absolute volume determination may remain for a while.
The histogramming configuration example looks like this:
.. code-block:: yaml

    --- # Histogramming configuration:
    parameter: "radius"
    nBin: 50
    binScale: "log"
    presetRangeMin: 3.14
    presetRangeMax: 314
    binWeighting: "vol"
    autoRange: True
    --- # second histogram
    parameter: "radius"
    nBin: 50
    binScale: "linear"
    presetRangeMin: 10
    presetRangeMax: 100
    binWeighting: "vol"
    autoRange: False
Lastly, the histogramming ranges have to be configured. This can be done by adding as many entries as required to the histogramming configuration YAML file. Parameter ranges can be set automatically (using the autoRange flag, which ignores the presetRangeMin and presetRangeMax values), or by setting fixed limits and leaving autoRange as False.
At the moment, the only bin weighting scheme implemented is the volume-weighted binning scheme, as it is the most reliable. Please open an issue ticket if you need number weighting to return.
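The volume-weighted scheme can be sketched as follows, assuming spherical contributions so that each contribution is weighted by r³ (the constant 4π/3 cancels after normalisation). This is a hypothetical helper for illustration, not McSAS3's own code:

```python
import numpy as np

def volume_weighted_histogram(radii, nBin=50, binScale="log",
                              rmin=None, rmax=None):
    # bin edges follow the configured scale and (preset or automatic) range
    rmin = radii.min() if rmin is None else rmin
    rmax = radii.max() if rmax is None else rmax
    if binScale == "log":
        edges = np.logspace(np.log10(rmin), np.log10(rmax), nBin + 1)
    else:
        edges = np.linspace(rmin, rmax, nBin + 1)
    weights = radii ** 3  # sphere volume ~ r^3; prefactor cancels below
    hist, edges = np.histogram(radii, bins=edges, weights=weights)
    return hist / hist.sum(), edges
```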
For each histogramming range, histogram-independent population statistics are also calculated and provided, both in the PDF as well as in the McSAS output state file. These can be read automatically from there later on.
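Such population statistics are weighted moments of the contribution parameter. A minimal sketch of a volume-weighted mean and standard deviation (illustrative only, not the exact quantities McSAS3 reports):

```python
import numpy as np

def population_stats(radii, weights=None):
    # volume-weighted mean and standard deviation of the contributions
    r = np.asarray(radii, dtype=float)
    w = r ** 3 if weights is None else np.asarray(weights, dtype=float)
    mean = np.average(r, weights=w)
    std = np.sqrt(np.average((r - mean) ** 2, weights=w))
    return mean, std
```

Note that volume weighting pulls the mean toward the larger contributions, which is why volume- and number-weighted statistics can differ substantially for broad distributions.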
Documentation
=============

https://BAMresearch.github.io/McSAS3
Development
===========

To run all the tests run::

    tox
Note, to combine the coverage data from all the tox environments run:
.. list-table::
    :widths: 10 90
    :stub-columns: 1

    - - Windows
      - ::

            set PYTEST_ADDOPTS=--cov-append
            tox

    - - Other
      - ::

            PYTEST_ADDOPTS=--cov-append tox