aps-8id-dys / bluesky

XPCS bluesky instrument configuration

How to write a NeXus metadata file with the lambda2m? #13

Open prjemian opened 1 year ago

prjemian commented 1 year ago

@qzhang234 asks (on Teams):

I believe the first step to saving a Nexus file with Lambda2M is to write a Bluesky plan? Would you happen to have a template with Nexus writing to help me get started?

prjemian commented 1 year ago

RE(bp.count([lambda2m])) is the very basic test. Inside a plan:

def my_plan():
    yield from bp.count([lambda2m])

then RE(my_plan())

prjemian commented 1 year ago

Template with NeXus writing: since we are using the NXWriter from apstools, there is no documentation yet on how to customize it. Here is the guide demonstrating the NXWriter:

https://bcda-aps.github.io/apstools/dev/examples/fw_nxwriter.html

Instead of NXWriter, we're using NXWriterAPS since it understands the APS. The plain NXWriter is for use outside of APS beamlines.

prjemian commented 1 year ago

In a related issue, writing the run metadata could use standardized methods. We identified in our conf call today that the standardized methods could be more general. The related issue (https://github.com/BCDA-APS/bluesky_training/issues/244) attempts to generalize the standardized methods.

@qzhang234 asked:

... how to use this with the Run Engine? Should I create a plan just like the one shown above, but replace motor_start_preprocessor with bp.count([lambda2m]), and then the plan would execute the count command and create the Nexus file?

Once this generalized method is implemented in the XPCS instrument package, it will not be necessary to change how the plan is written. You can use the standard plans (bp.count, for example), and any ophyd objects labeled as ad_metadata will be written to the new stream label_start_ad_metadata.

qzhang234 commented 1 year ago

@prjemian I see. So should I add this part to my Bluesky plan and then proceed to the normal bp.count?

image

prjemian commented 1 year ago

That's not the way it works. This part:

RE.subscribe(nxwriter.receiver)

configures the session to record a new NeXus file with every run. So if you called

RE(bp.count([some_area_detector]))

you would get a NeXus file. Then if you called

RE(bp.scan([det1, det2], motor, 10, 20, 11))

you'd get another NeXus file. Same for

RE(AD_Acquire())

you'd get yet another NeXus file. No extra setup code needed.

prjemian commented 1 year ago

As we discussed on Friday, by default NXWriter() will try to copy the area detector image from the EPICS IOC-created file into the bluesky-created NeXus file. That's not what we want for XPCS. We'll need to provide some local changes to the standard NXWriter code.

prjemian commented 1 year ago

Note for me: write_stream_external() is called when the area detector HDF plugin is used. Probably this code is the one to revise or replace.

prjemian commented 12 months ago

Not sure we want this as the final code in the MyNXWriter class, but it is a starting point:

    def write_stream_external(self, parent, d, subgroup, stream_name, k, v):
        resource_id = self.get_unique_resource(d)
        fname = self.getResourceFile(resource_id)

        ds = subgroup.create_dataset("file", data=fname.name)
        h5addr = "/entry/data/data"
        ds.attrs["target"] = ds.name
        ds.attrs["source_file"] = str(fname)
        ds.attrs["source_address"] = h5addr
        ds.attrs["resource_id"] = resource_id
        ds.attrs["shape"] = v.get("shape", "")

        subgroup["external"] = h5py.ExternalLink(str(fname), h5addr)

    def get_unique_resource(self, d):
        # count number of unique resources (expect only 1)
        resource_id_list = []
        for datum_id in d:
            resource_id = self.externals[datum_id]["resource"]
            if resource_id not in resource_id_list:
                resource_id_list.append(resource_id)
        if len(resource_id_list) != 1:
            # fmt: off
            raise ValueError(
                f"{len(resource_id_list)}"
                f" unique resource UIDs: {resource_id_list}"
            )
            # fmt: on
        return resource_id_list[0]

On my local workstation, it results in this HDF5 structure:

            adsimdet --> /entry/instrument/bluesky/streams/primary/adsimdet_image
            adsimdet_image:NXdata
              @NX_class = "NXdata"
              @signal_type = "detector"
              @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image"
              EPOCH:NX_FLOAT64 = 1693267642.3699787
                @long_name = "epoch time (s)"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/EPOCH"
                @units = "s"
              external: missing external file
                @file = "/tmp/docker_ioc/iocad/tmp/adsimdet/2023/08/28/e355b37c-0d71-4dc5-a5b4_000000.h5"
                @path = "/entry/data/data"
              file:NX_CHAR = b'e355b37c-0d71-4dc5-a5b4_000000.h5'
                @resource_id = "5357ad07-3d01-46eb-b9de-969c9ae708cf"
                @shape = [ 100 1024 1024]
                @source_address = "/entry/data/data"
                @source_file = "/tmp/docker_ioc/iocad/tmp/adsimdet/2023/08/28/e355b37c-0d71-4dc5-a5b4_000000.h5"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/file"
              time:NX_FLOAT64 = 0.0
                @long_name = "time since first data (s)"
                @start_time = 1693267642.3699787
                @start_time_iso = "2023-08-28T19:07:22.369979"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/time"
                @units = "s"

prjemian commented 12 months ago

Note: "external: missing external file" is a local situation on my workstation. If the external file were accessible, then external would actually show the data in the external file.
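
A small sketch of that behavior, using only h5py and numpy with temporary files. The file names `images.h5` and `master.h5` and the array shape are made up for the illustration; only the `/entry/data/data` address and the `external` link name come from the code above.

```python
import os
import tempfile

import h5py
import numpy as np

tmpdir = tempfile.mkdtemp()
image_file = os.path.join(tmpdir, "images.h5")   # stands in for the IOC-written file
master_file = os.path.join(tmpdir, "master.h5")  # stands in for the NeXus file

# The "detector" file holds the image data at /entry/data/data.
with h5py.File(image_file, "w") as f:
    f.create_dataset("/entry/data/data", data=np.zeros((3, 4, 5), dtype="uint8"))

# The master file stores only a link, not a copy of the data.
with h5py.File(master_file, "w") as f:
    group = f.create_group("adsimdet_image")
    group["external"] = h5py.ExternalLink(image_file, "/entry/data/data")

# While the image file is reachable, the link resolves to the dataset.
with h5py.File(master_file, "r") as f:
    shape_when_present = f["adsimdet_image/external"].shape

# Remove the image file to mimic an inaccessible external file.
os.remove(image_file)
with h5py.File(master_file, "r") as f:
    # The link itself can still be inspected without resolving it.
    link = f["adsimdet_image"].get("external", getlink=True)
    try:
        f["adsimdet_image/external"]
        resolved = True
    except (KeyError, OSError):
        resolved = False

print(shape_when_present, link.path, resolved)
```

The link records only a file name and an HDF5 address, so whether it resolves depends on the machine that opens the master file being able to reach the path stored in the link.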

qzhang234 commented 12 months ago

@prjemian This is fantastic! Should I do a git pull on Kouga?

Also, per our discussion today, how do I unsubscribe nxwriter? Not super important at this moment, just curious

prjemian commented 12 months ago

Nothing to pull. We'll use the old copy-and-paste technique.

To unsubscribe, we need the integer key that was returned to us when we first subscribed. Since we did not store that key, it's not easy to get it later. That's why I had you comment out that part in the setup.

prjemian commented 12 months ago

Revision to one of the above methods:

    def write_stream_external(self, parent, d, subgroup, stream_name, k, v):
        resource_id = self.get_unique_resource(d)
        fname = self.getResourceFile(resource_id)

        h5addr = "/entry/data/data"
        ds = h5py.ExternalLink(str(fname), h5addr)  # TODO: check the path
        ds.attrs["target"] = ds.name
        ds.attrs["source_file"] = str(fname)
        ds.attrs["source_address"] = h5addr
        ds.attrs["resource_id"] = resource_id
        ds.attrs["shape"] = v.get("shape", "")
        subgroup["value"] = ds
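
One thing to check regarding the TODO: an `h5py.ExternalLink` object only carries a file name and a path, and it has no `attrs` (or `name`), so the `ds.attrs[...]` lines would fail as written. The sketch below demonstrates this and shows one possible alternative, which is an assumption on my part and not the final design: create the link first, then record the source metadata as attributes on the enclosing group. The file names are made up for the illustration.

```python
import os
import tempfile

import h5py

# An ExternalLink is a plain descriptor: filename + path, nothing else.
link = h5py.ExternalLink("images.h5", "/entry/data/data")
link_has_attrs = hasattr(link, "attrs")

master_file = os.path.join(tempfile.mkdtemp(), "master.h5")
with h5py.File(master_file, "w") as f:
    subgroup = f.create_group("adsimdet_image")

    # Store the link under the name used in the revision above.
    subgroup["value"] = link

    # Assumed alternative: keep the metadata on the group, which does
    # support attributes, instead of on the link object.
    subgroup.attrs["source_file"] = link.filename
    subgroup.attrs["source_address"] = link.path

    stored = dict(subgroup.attrs)

print(link_has_attrs, stored)
```

Attributes could instead go on a companion dataset, as the first version did with `file`; either way they have to live on a real HDF5 object in the master file, since the link target may not be reachable when the file is written.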

prjemian commented 2 months ago