DiamondLightSource / mx-bluesky

Bluesky plans, plan stubs, and utilities for MX beamlines
https://diamondlightsource.github.io/mx-bluesky/
Apache License 2.0

Take some fluorescence hdf files with the xspress3 #236

Open DominicOram opened 4 months ago

DominicOram commented 4 months ago

GDA reads the data out of the xspress3 directly from PVs, then writes it to its own file format. However, the xspress3 IOC has the ability to write HDF files, which we would like to use. The first step in doing this is spending some time on the beamline manually triggering the fluorescence detector and getting an example file out. This can then be passed to analysis so that they can update their pipelines to work with it.

Acceptance Criteria

DominicOram commented 4 months ago

When we do this we should get @pblowey and @Relm-Arrowny involved

DominicOram commented 3 months ago

Time put aside on the afternoon of 11th July for this

DominicOram commented 3 months ago

Tested with:

from bluesky import plan_stubs as bps
from bluesky.run_engine import RunEngine
from dodal.beamlines import i03
from dodal.devices.zebra_controlled_shutter import ZebraShutterState

RE = RunEngine()

shutter = i03.sample_shutter()
fluo_det = i03.xspress3mini()

def my_plan():
    yield from bps.abs_set(shutter, ZebraShutterState.OPEN)
    yield from bps.abs_set(fluo_det.acquire_time, 0.01)
    yield from bps.stage(fluo_det, wait=True)
    yield from bps.unstage(fluo_det, wait=True)
    yield from bps.abs_set(shutter, ZebraShutterState.CLOSE)

RE(my_plan())

File produced at /dls/i03/data/2024/cm37235-3/fluo_tests/test.h5. Sending it to analysis properly is covered in https://github.com/DiamondLightSource/mx-bluesky/issues/233. PR incoming with hotfixes to tidy up.

DominicOram commented 3 months ago

To get this to work we had to set BL03I-EA-XSP3-01:HDF5:FileTemplate to %s/%s.h5
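For context, a rough sketch of how that template expands into the file name above. It assumes the HDF5 file plugin fills the template printf-style with the file path and then the file name; `expand_template` is a hypothetical helper for illustration, not part of any library, and the path and name are taken from the test file mentioned earlier.

```python
def expand_template(template: str, file_path: str, file_name: str) -> str:
    """Mimic printf-style expansion of a FileTemplate (illustrative only).

    ASSUMPTION: the plugin substitutes the file path first, then the file name.
    """
    return template % (file_path, file_name)


full_name = expand_template(
    "%s/%s.h5",
    "/dls/i03/data/2024/cm37235-3/fluo_tests",  # directory the file was written to
    "test",  # file name without extension
)
print(full_name)
```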

DominicOram commented 3 months ago

Blocked on https://github.com/DiamondLightSource/dodal/issues/679

DominicOram commented 3 months ago

Do next Friday?

pblowey commented 3 months ago

An update from my side of things, as well as some comments/suggestions:

I've now got a PR ready (DiamondLightSource/python-zocalo-pymca/pull/8) for the changes to the pymca_fitter service that will allow it to process h5 files. Full details can be found in that PR. I've also updated the pymca_fitter recipe to accept cfgFile and h5path as custom parameters. cfgFile allows a specific config file path for the pymca fit to be applied. If not specified, it defaults to the config file for the beamline (it determines which beamline from the input file path). h5path allows a specific path to the data within the h5 file to be set. I will set the default to whatever path we agree to store the data at in the file, but this parameter might be useful for testing purposes.

In terms of storing the data, I've found this documentation for writing MCA data using NeXus conventions. Would you be able to get the hdf5 file output by the xspress3 into this format? It would make the hdf5 file a lot more user-friendly for any users who want to fit the data themselves in PyMCA, as the relevant metadata is stored with the measurement. That documentation suggests storing calibration, channels and live time as metadata in the file. I think, since we are recording data from all channels in a straightforward way (0-4095), we don't need the channels stored as metadata. I also don't know if the live time is something that can be readily obtained from the xspress3. The calibration needs to be stored in the file as a dataset in the format [zero, gain, non-linearity_parameter]. The calibrations are currently stored in the PyMCA config files for each beamline.
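To make the calibration format concrete, here is a minimal sketch of turning channel numbers into energies from [zero, gain, non-linearity_parameter]. It assumes the usual quadratic form, energy = zero + gain * channel + non_linearity * channel^2. The calibration values and units below are made up for illustration and are not a real beamline calibration.

```python
def channel_to_energy(channel, calibration):
    """Apply a [zero, gain, non_linearity] calibration to one channel number."""
    zero, gain, non_linearity = calibration
    return zero + gain * channel + non_linearity * channel**2


# HYPOTHETICAL values: zero in keV, gain in keV/channel, no quadratic term
calibration = [0.0, 0.01, 0.0]

# All channels 0-4095 are recorded, so the channel axis is just the index
energies = [channel_to_energy(ch, calibration) for ch in range(4096)]
print(energies[0], energies[-1])
```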

The calibrations for each beamline's MCA will need to be redone, as currently GDA does not store all channels and (at least on some beamlines) applies an offset. In principle, this would only require changing the zero parameter, but it would make sense to redo the calibration completely to be sure that the calibration is good over the desired region of interest. In any case, while GDA and hyperion are both in operation, we'll need separate pymca config files as they'll require different calibrations.
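As a sketch of why only the zero parameter would need to change in principle: assume GDA's channel g corresponds to full-range channel g + offset, and that the calibration is purely linear. The same energy then has to come out for both indexings, which shifts zero by gain * offset. The direction of the offset and all numbers here are assumptions for illustration, not the real GDA behaviour.

```python
def shift_zero_for_offset(calibration, channel_offset):
    """Convert a linear calibration for offset channels to full-range channels.

    ASSUMPTION: offset channel g corresponds to full-range channel g + channel_offset.
    """
    zero, gain, non_linearity = calibration
    if non_linearity != 0:
        # With a quadratic term the gain changes too, so a full refit is needed
        raise ValueError("only valid for a purely linear calibration")
    return [zero - gain * channel_offset, gain, 0.0]


old_calibration = [0.5, 0.01, 0.0]  # HYPOTHETICAL calibration for offset channels
new_calibration = shift_zero_for_offset(old_calibration, channel_offset=10)

# Offset channel 5 and full-range channel 15 should give the same energy
energy_old = old_calibration[0] + old_calibration[1] * 5
energy_new = new_calibration[0] + new_calibration[1] * 15
print(energy_old, energy_new)
```

This only covers the linear case, which is one more reason to redo the calibration over the region of interest rather than patch the zero value.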

DominicOram commented 3 months ago

In terms of storing the data, I've found this documentation for writing MCA data using NeXus conventions. Would you be able to get the hdf5 file output by the xspress3 into this format?

Maybe, the other option, which is probably preferable, is that we wrap the raw h5 with a VDS link like we do for the diffraction data. Do we have to use that convention or are we able to use https://manual.nexusformat.org/classes/applications/NXfluo.html#nxfluo? I think I would prefer a defined application definition if possible

pblowey commented 3 months ago

Maybe, the other option, which is probably preferable, is that we wrap the raw h5 with a VDS link like we do for the diffraction data. Do we have to use that convention or are we able to use

Just to check that I understand, with the diffraction data, do you write hdf5 file(s) containing the raw diffraction data as well as an hdf5 file containing the metadata, which are then linked together in a .nxs file? I don't know enough about this to know why that's preferable, is it just to save you from editing the raw data file?

https://manual.nexusformat.org/classes/applications/NXfluo.html#nxfluo? I think I would prefer a defined application definition if possible

I wasn't aware of this specific definition but I think, as long as you store the calibration at /entry/instrument/fluorescence/calibration (assuming data is at /entry/instrument/fluorescence/data), the PyMCA GUI would pick it up.

DominicOram commented 3 months ago

Just to check that I understand, with the diffraction data, do you write hdf5 file(s) containing the raw diffraction data as well as an hdf5 file containing the metadata, which are then linked together in a .nxs file? I don't know enough about this to know why that's preferable, is it just to save you from editing the raw data file?

Yes, if you take a look in e.g. /dls/i03/data/2024/cm37235-1/TestInsulin/ins_1 there are ins_1_9_*.h5, which the detector writes, then ins_1_9.nxs, which DAQ writes. The reason for this is that there is a lot of metadata that the detector isn't aware of, e.g. in the diffraction case motor positions. In this case it doesn't look like the detector is aware of the calibration. It would probably be possible to change the xspress3 IOC so that we can pass this metadata to it, but I think doing it the same way as the diffraction data, where we wrap it in a NeXus file, has a nice symmetry and means that we could more easily add extra metadata later too.

I wasn't aware of this specific definition but I think, as long as you store the calibration at /entry/instrument/fluorescence/calibration (assuming data is at /entry/instrument/fluorescence/data), the PyMCA GUI would pick it up.

Ah, so it has to have a calibration curve rather than an array of energies like nxfluo has? That's fine I guess

pblowey commented 3 months ago

I just had a play around, seeing what the PyMCA GUI would do if I provided energy in the nxfluo format. While it was possible to get it to display a spectrum using the energy values on the x-axis, it doesn't then treat the spectrum as an MCA spectrum - it seems to treat it as if the x-axis is position rather than energy. As far as I can tell, the only option is to put a calibration in the file. Would there be an issue creating the file in the NXfluo definition but then also adding /entry/instrument/fluorescence/calibration to it, so you have the data in the proper Nexus format but then also enable PyMCA to read the data correctly?
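A rough sketch of what such a file could look like, using h5py. It writes a wrapper file that links to the raw detector data and stores both an NXfluo-style energy axis and a calibration dataset next to it. Several things here are assumptions for illustration: the file names, the raw dataset path /entry/data/data, the calibration values, and the use of a plain HDF5 external link instead of a VDS. It is not a complete or validated NXfluo file, since the other groups that definition asks for are left out.

```python
import os
import tempfile

import h5py
import numpy as np

N_CHANNELS = 4096  # channels 0-4095, as discussed in this thread
RAW_DATA_PATH = "/entry/data/data"  # ASSUMPTION: where the IOC stores the spectrum


def write_fake_raw_file(path):
    """Stand-in for the file written by the xspress3 IOC (layout is assumed)."""
    with h5py.File(path, "w") as raw:
        raw.create_dataset(RAW_DATA_PATH, data=np.zeros(N_CHANNELS, dtype=np.uint32))


def write_wrapper(nxs_path, raw_filename, calibration):
    """Write a wrapper file linking to the raw data, with energy and calibration."""
    zero, gain, non_linearity = calibration
    channels = np.arange(N_CHANNELS)
    with h5py.File(nxs_path, "w") as f:
        entry = f.create_group("entry")
        entry.attrs["NX_class"] = "NXentry"
        entry.create_dataset("definition", data="NXfluo")
        instrument = entry.create_group("instrument")
        instrument.attrs["NX_class"] = "NXinstrument"
        detector = instrument.create_group("fluorescence")
        detector.attrs["NX_class"] = "NXdetector"
        # Link to the raw file rather than copying or editing it
        detector["data"] = h5py.ExternalLink(raw_filename, RAW_DATA_PATH)
        # Energy axis in the NXfluo style, derived from the calibration
        energy = zero + gain * channels + non_linearity * channels**2
        detector.create_dataset("energy", data=energy)
        # Extra dataset so that PyMCA can treat the data as an MCA spectrum
        detector.create_dataset("calibration", data=np.asarray(calibration, dtype=float))


workdir = tempfile.mkdtemp()
write_fake_raw_file(os.path.join(workdir, "test.h5"))
nxs_file = os.path.join(workdir, "test.nxs")
# The link target is given relative to the wrapper file's directory
write_wrapper(nxs_file, "test.h5", [0.0, 0.01, 0.0])  # HYPOTHETICAL calibration

with h5py.File(nxs_file, "r") as f:
    fluorescence = f["/entry/instrument/fluorescence"]
    print(fluorescence["data"].shape, fluorescence["calibration"][()])
```

Keeping the calibration as a separate dataset means the energy array and the calibration can disagree if only one is updated, so whatever writes the file would probably want to derive one from the other, as above.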