NSLS-II-SIX / sixtools

Software for performing resonant inelastic x-ray scattering analysis at NSLS-II
https://pypi.org/project/sixtools

Adding the new handler here temporarily while the PR makes its way through to DB #33

Closed. awalter-bnl closed this 5 years ago

awalter-bnl commented 5 years ago

This will temporarily add the new single-mode HDF5 file handler here while it makes its way into a databroker release.

codecov-io commented 5 years ago

Codecov Report

Merging #33 into master will not change coverage. The diff coverage is n/a.


@@           Coverage Diff           @@
##           master      #33   +/-   ##
=======================================
  Coverage   43.26%   43.26%           
=======================================
  Files           3        3           
  Lines         282      282           
=======================================
  Hits          122      122           
  Misses        160      160

Continue to review full report at Codecov.


tacaswell commented 5 years ago

That is not just a Travis error; I would expect to see that NameError on import in all cases.

If we do want to go down the route of returning h5py objects, then we probably should do some caching of them internally so we do not end up with N references to any given file running around.
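
A minimal sketch of that caching idea, assuming a module-level cache keyed by file path (names here are illustrative, not the sixtools implementation):

```python
import h5py

# Keep one open, read-only h5py.File per path so repeated handler calls do
# not leave N references to the same file floating around.
_FILE_CACHE = {}


def get_cached_file(filename):
    """Return a cached h5py.File for `filename`, opening it only once."""
    if filename not in _FILE_CACHE:
        _FILE_CACHE[filename] = h5py.File(filename, 'r')
    return _FILE_CACHE[filename]


def close_cached_files():
    """Close and forget every cached file handle."""
    while _FILE_CACHE:
        _, handle = _FILE_CACHE.popitem()
        handle.close()
```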

ambarb commented 5 years ago

Sorry guys, this was my best guess as to where to put this requested information that I found in the EPICS software manual (page 38). Note that the summed region is hardcoded in the CSS page as 3x3, but the user can use either 3x3 or 2x2 methodologies.

@bisogni this is related to the lunch time discussion and I was asked to get back to DAMA with the column headings ASAP.

XIP data file format: 7 columns

column 1: x
column 2: y
column 3: x eta correction
column 4: y eta correction
column 5: y eta correction & isolinear correction
column 6: "sum of 3x3 region" (though this could also mean 2x2 region)
column 7: XIP mode (which I assume to be either 3x3 or 2x2)

In the end, columns 3, 5, and 6 are the X, Y, I information that will need to be converted into a spectrum, where X and Y will be used to determine the energy (x-axis) to be plotted against I. However, I think we want to retain access to all columns.
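
As a hedged illustration of that conversion (the column names below are placeholders for whatever labels we settle on, not actual handler output), the spectrum could be built by histogramming the corrected y positions weighted by the summed intensity:

```python
import numpy as np

def events_to_spectrum(events, n_bins=2000):
    """Histogram corrected photon positions, weighted by intensity.

    `events` is assumed to be a pandas DataFrame with illustrative column
    names 'y_eta_iso' (column 5) and 'sum_region' (column 6).
    """
    y = events['y_eta_iso'].to_numpy()
    weights = events['sum_region'].to_numpy()
    counts, edges = np.histogram(y, bins=n_bins, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])  # stand-in for the energy axis
    return centers, counts
```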

Both modes (3x3 and 2x2) were collected over the last few days. We can provide particular scan numbers for testing.

awalter-bnl commented 5 years ago

Thanks @ambarb, here is fine for the data.

tacaswell commented 5 years ago

Thanks, that is perfect @ambarb

mpmdean commented 5 years ago

> Sorry guys, this was my best guess as to where to put this requested information that I found in the EPICS software manual (page 38). Note that the summed region is hardcoded in the CSS page as 3x3, but the user can use either 3x3 or 2x2 methodologies.
>
> @bisogni this is related to the lunch time discussion and I was asked to get back to DAMA with the column headings ASAP.
>
> XIP data file format: 7 columns
>
> column 1: x
> column 2: y
> column 3: x eta correction
> column 4: y eta correction
> column 5: y eta correction & isolinear correction
> column 6: "sum of 3x3 region" (though this could also mean 2x2 region)
> column 7: XIP mode (which I assume to be either 3x3 or 2x2)
>
> In the end, columns 3, 5, and 6 are the X, Y, I information that will need to be converted into a spectrum, where X and Y will be used to determine the energy (x-axis) to be plotted against I. However, I think we want to retain access to all columns.
>
> Both modes (3x3 and 2x2) were collected over the last few days. We can provide particular scan numbers for testing.

I agree we want access to all columns. I would suggest that we pass the raw photon positions into our data stream and do the (very easy) curvature correction ourselves; otherwise we will get into problems with one curvature correction happening in XCAM's software and another happening in our data processing.
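
A rough sketch of what doing the curvature correction ourselves could look like, assuming raw (x, y) photon positions and a low-order polynomial fit to a reference (e.g. elastic) line; this is illustrative, not an agreed implementation:

```python
import numpy as np

def fit_curvature(x_ref, y_ref, order=2):
    """Fit the isoenergetic curvature (y offset vs x) on a reference line."""
    return np.polyfit(x_ref, y_ref, order)


def correct_curvature(x, y, curvature):
    """Remove the x-dependent curvature from the raw y positions."""
    return y - np.polyval(curvature, x)
```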

tacaswell commented 5 years ago

@mpmdean I think we would return you a pd.DataFrame with all of the columns and you can then do what you want with them on the analysis side.

mpmdean commented 5 years ago

@tacaswell sure, that sounds good. @ambarb I hope this is not all the information returned by the plugin, as it has omitted any information about pixel clusters that the algorithm has identified as having more than one photon in them!

awalter-bnl commented 5 years ago

I have committed changes to make the return data type a pandas DataFrame. This has been tested on the file from scan_id = 26500.

@ambarb, @bisogni and @mpmdean please look at the column labels in self._column_names to make sure my abbreviations make sense.

At some point I will need to update the spec in the file handler in NSLS-II-SIX/Profile_collection so it will work with the new spec in here. I will coordinate with @bisogni as to the best time to do this.

ambarb commented 5 years ago

@mpmdean Do you get this info with the PSI software? Perhaps what they have is different from what we have from the vendor since PSI owns the rights to the algorithm. Can you provide a more detailed list of data that you get from PSI?

The algorithm that is implemented by the vendor is supposed to reject events that are not one photon. We do see in EPICS how many events are rejected from possible single photon events. Perhaps there are more outputs from the algorithm that we can potentially ask to have uncovered by the EPICS XIP plugin (provided by the vendor).

Conversely, we are recording the actual images to disk so that we do not necessarily have to rely on a black box for the single photon counting.

awalter-bnl commented 5 years ago

Responding to:

> Suggested changes to make more generic for future use.

@ambarb suggested a really nice idea, which was to pass column_names to the handler via the entry in the asset store, i.e. place it in the insert.

I just want to clarify: I think you want to have this as a kwarg in AreaDetectorHDF5SingleHandler_centroid7xn here, which would be passed in via the resource kwargs in FileStoreHDF5Single here. Is this what you had in mind?

I am not 100% convinced of this: on the plus side it would make AreaDetectorHDF5SingleHandler_centroid7xn completely generic, but it would mean making a bespoke version of FileStoreHDF5Single.

Perhaps this change could go into RIXScamHDF5pluginforXIP here, which we already make as a bespoke version.

The only major concern I have with this is what we would make the 'default' value for column_names so that this would work without the bespoke version (i.e. as a general version). The best idea I can come up with is to make this None by default, which would mirror the pandas.DataFrame default behaviour, but is that really a good option?
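
For reference, a small illustration of the pandas default behaviour being mirrored here: with columns=None the DataFrame simply falls back to integer column labels.

```python
import numpy as np
import pandas as pd

data = np.arange(12).reshape(4, 3)
# columns=None is the pandas default: labels fall back to integers 0, 1, 2.
print(pd.DataFrame(data, columns=None).columns.tolist())  # [0, 1, 2]
```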

What do others think?

stuwilkins commented 5 years ago

Thanks @awalter-bnl, that is what I was thinking of. I was thinking that the column names, and even the hard-coded key, should be defined at data creation time, as they are defined by the data source (in this case the HDF5 plugin); this makes the data retrieval handlers much more generic. It's just a thought, and I realize it could be overkill. I don't agree that it makes the data entry handlers more bespoke, as they are really defined by the data creation, which is by definition bespoke, especially in this case.

mpmdean commented 5 years ago

> @mpmdean Do you get this info with the PSI software? Perhaps what they have is different from what we have from the vendor since PSI owns the rights to the algorithm. Can you provide a more detailed list of data that you get from PSI?
>
> The algorithm that is implemented by the vendor is supposed to reject events that are not one photon. We do see in EPICS how many events are rejected from possible single photon events. Perhaps there are more outputs from the algorithm that we can potentially ask to have uncovered by the EPICS XIP plugin (provided by the vendor).
>
> Conversely, we are recording the actual images to disk so that we do not necessarily have to rely on a black box for the single photon counting.

@ambarb

The PSI output also has "bad events", i.e. an image of events that didn't properly centroid.

tacaswell commented 5 years ago

> I just want to clarify: I think you want to have this as a kwarg in AreaDetectorHDF5SingleHandler_centroid7xn here, which would be passed in via the resource kwargs in FileStoreHDF5Single here. Is this what you had in mind?

I think this is a very good idea.

Changing to a file format that better represents that this is tabular data (and includes, e.g., the column names) would be even better, but would require changes to the AD IOC, the ophyd AD class, and a new handler.

> The only major concern I have with this is what we would make the 'default' value for column_names so that this would work without the bespoke version (i.e. as a general version).

Make column_names required with no default value?

awalter-bnl commented 5 years ago

Based on the comments above and a discussion in the DAMA weekly meeting, I have updated this to a single handler which takes key and column_names as kwargs. Default values have been provided for backwards compatibility so that existing files taken at SIX will work.
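
For readers following along, a hedged skeleton of what such a handler could look like (this is not the code in the PR; the class name, key path and defaults below are placeholders):

```python
import h5py
import pandas as pd

DEFAULT_KEY = '/entry/data/data'   # placeholder for the back-compatible key
DEFAULT_COLUMNS = None             # falls back to integer labels, as in pandas


class XIPSingleHandler:
    """Illustrative single handler taking `key` and `column_names` kwargs."""

    specs = {'AD_HDF5_SINGLE', 'AD_HDF5_XIP'}

    def __init__(self, filename, key=DEFAULT_KEY, column_names=DEFAULT_COLUMNS):
        self._file = h5py.File(filename, 'r')
        self._key = key
        self._column_names = column_names

    def __call__(self, point_number):
        # Return each event's 7xN table as a DataFrame with all columns.
        raw = self._file[self._key][point_number]
        return pd.DataFrame(raw.T, columns=self._column_names)

    def close(self):
        self._file.close()
```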

I will update the playbook files to create maps from both 'AD_HDF5_SINGLE' (the current spec saved in resource_kwargs) and 'AD_HDF5_XIP' (the future spec saved in resource_kwargs) to this handler.
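
Something along these lines, assuming a databroker-v1-style registry exposed as db.reg and the illustrative handler class sketched above:

```python
# Map both the current and the future spec to the same handler class.
db.reg.register_handler('AD_HDF5_SINGLE', XIPSingleHandler, overwrite=True)
db.reg.register_handler('AD_HDF5_XIP', XIPSingleHandler, overwrite=True)
```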

I will also open a PR against profile_collection to update the bespoke RIXScamHDF5PluginForXIP to include the filestore_spec 'AD_HDF5_XIP' and column_names in the resource_kwargs.