vasole / pymca

PyMca Toolkit git repository

Error when saving fit with option csv/dat turned on (save as ascii) using fitMultipleSpectra #1059

Closed ptim0626 closed 4 months ago

ptim0626 commented 5 months ago

When OutputBuffer is initialised with saveFit and/or saveResiduals set to True while csv and/or dat is also enabled, FastXRFLinearFit.fitMultipleSpectra fails with the following error:

Traceback (most recent call last):
  File "<...>/pymca_fast_fit_test.py", line 85, in <module>
    fast_fit()
  File "<...>/pymca_fast_fit_test.py", line 72, in fast_fit
    with outbuffer.saveContext():
  File "<...>/lib/python3.10/contextlib.py", line 142, in __exit__
    next(self.gen)
  File "<...>/PyMca5/PyMcaIO/OutputBuffer.py", line 704, in saveContext
    self.save()
  File "<...>/PyMca5/PyMcaIO/OutputBuffer.py", line 745, in save
    self._saveImages()
  File "<...>/PyMca5/PyMcaIO/OutputBuffer.py", line 836, in _saveImages
    ArraySave.save2DArrayListAsASCII(imageList, fileName, csv=True,
  File "<...>/PyMca5/PyMcaIO/ArraySave.py", line 191, in save2DArrayListAsASCII
    fileline += "%s%g" % (csvseparator, datalist[i][row, col])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

I am using 5.9.0; when I switch to 5.5.5 there is no error (with Python 3.10). I guess the changes in #710 introduced the issue: after commenting out that block, it works in 5.9.0. The save2DArrayListAsASCII method tries to save the derivative as an image, but the derivative is a 1-dimensional array containing NaN.
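The failure mode can be reproduced outside PyMca with plain NumPy. The sketch below is not PyMca code: `datalist`, the array shapes, and the loop are assumptions chosen only to mimic a list in which one entry is 1-dimensional while the others are 2-dimensional maps.

```python
import numpy

# Hypothetical stand-ins: a 2-D parameter map and a 1-D array holding NaN,
# mimicking the shapes described in the report (not actual PyMca data).
image = numpy.arange(6, dtype=float).reshape(2, 3)
derivative = numpy.array([numpy.nan])

datalist = [image, derivative]
row, col = 0, 0

messages = []
for data in datalist:
    try:
        # Same indexing pattern as in the traceback: data[row, col]
        messages.append("%g" % data[row, col])
    except IndexError as exc:
        # Two indices on a 1-D array raise IndexError
        messages.append("IndexError: %s" % exc)

for message in messages:
    print(message)
```

The 2-D entry formats normally, while the 1-D entry raises the same kind of IndexError as in the traceback.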

The following example (adapted from the tests) should illustrate the issue:

import os
import tempfile
import shutil

import numpy
import h5py

from PyMca5.tests import XrfData
from PyMca5.PyMcaPhysics.xrf import FastXRFLinearFit
from PyMca5.PyMcaPhysics.xrf.XRFBatchFitOutput import OutputBuffer
from PyMca5.PyMcaIO import HDF5Stack1D

def fast_fit():
    # setup
    fp = tempfile.mkdtemp(prefix="pymca")

    # generate the data
    data, livetime = XrfData.generateXRFData()
    configuration = XrfData.generateXRFConfig()
    configuration["fit"]["stripalgorithm"] = 1

    # create HDF5 file (the context manager flushes and closes it)
    fname = os.path.join(fp, "FastXRF.h5")
    with h5py.File(fname, "w") as h5file:
        h5file["/data"] = data
        h5file["/data_int32"] = (data * 1000).astype(numpy.int32)

    fastFit = FastXRFLinearFit.FastXRFLinearFit()
    fastFit.setFitConfiguration(configuration)

    outputDir = '/tmp/'
    outputRoot = ""
    fileEntry = ""
    fileProcess = ""
    refit = None
    weight = 0
    tif = 0
    edf = 0
    csv = 1             # 1 triggers the error; 0 works
    h5 = 1
    dat = 1             # 1 triggers the error; 0 works
    concentrations = 0
    diagnostics = 0
    debug = 0
    overwrite = 1
    multipage = 0

    outbuffer = OutputBuffer(
        outputDir=outputDir,
        outputRoot=outputRoot,
        fileEntry=fileEntry,
        fileProcess=fileProcess,
        diagnostics=diagnostics,
        tif=tif,
        edf=edf,
        csv=csv,
        h5=h5,
        dat=dat,
        multipage=multipage,
        overwrite=overwrite,
        saveFit=True,          # True triggers the error; False works
        saveResiduals=True,    # True triggers the error; False works
    )

    # test standard reading
    scanlist = None
    selection = {"y": "/data"}
    dataStack = HDF5Stack1D.HDF5Stack1D([fname], selection, scanlist=scanlist)
    with outbuffer.saveContext():
        fastFit.fitMultipleSpectra(
            y=dataStack,
            weight=weight,
            refit=refit,
            concentrations=concentrations,
            outbuffer=outbuffer,
        )

    # tear-down
    shutil.rmtree(fp)

if __name__ == '__main__':
    fast_fit()

I am not a seasoned user of PyMca so if the above setup doesn't make sense, please give me some advice on how I should configure the fast fit. Thank you.

vasole commented 5 months ago

Just to let you know that I'm looking into it.

vasole commented 4 months ago

Indeed, it is impossible to save in the same csv or dat file something that does not have the same dimensions as the map; only the HDF5 format can store everything in a single file.

The simplest solution would be to skip saving that information into those formats.
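As a rough sketch of that idea, and not the actual change made in PyMca, the arrays could be filtered by shape before the list is handed to an ASCII writer. The helper name `select_map_shaped`, the labels, and the shapes are hypothetical and chosen only for illustration.

```python
import numpy


def select_map_shaped(labels, arrays, map_shape):
    """Return the (label, array) pairs whose shape equals map_shape.

    Hypothetical helper for illustration; not part of PyMca.
    """
    kept = []
    for label, array in zip(labels, arrays):
        array = numpy.asarray(array)
        if array.shape == tuple(map_shape):
            kept.append((label, array))
    return kept


map_shape = (2, 3)
labels = ["Fe_K", "derivative"]
arrays = [numpy.ones(map_shape), numpy.array([numpy.nan])]

# Only entries with the same dimensions as the map survive the filter
kept = select_map_shaped(labels, arrays, map_shape)
kept_labels = [label for label, _ in kept]
print(kept_labels)
```

Entries that are dropped this way would still be available in the HDF5 output, which has no such shape restriction.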

vasole commented 4 months ago

> I am not a seasoned user of PyMca so if the above setup doesn't make sense, please give me some advice on how I should configure the fast fit. Thank you.

I recommend getting used to the program through the graphical user interface before writing your own scripts.

To configure the fast fit, the simplest approach is to use PyMca itself to generate the fit configuration file and then load it into your scripts.

The code below illustrates how to read the fit configuration file:

    from PyMca5.PyMcaPhysics.xrf import FastXRFLinearFit
    from PyMca5.PyMcaIO import ConfigDict

    # Load the fit configuration file generated with PyMca
    configuration = ConfigDict.ConfigDict()
    configuration.read("your_fit_configuration_file.cfg")

    # Pass the configuration to the fast fit
    fastFit = FastXRFLinearFit.FastXRFLinearFit()
    fastFit.setFitConfiguration(configuration)

ptim0626 commented 4 months ago

Thanks @vasole for the fix and suggestion! I inherited the scripts from others, and I will definitely explore the GUI myself to understand the various options.