European-XFEL / EXtra-data

Access saved EuXFEL data
https://extra-data.rtfd.io
BSD 3-Clause "New" or "Revised" License

read files from a TDC recorded by a MCP-DLD #451

Open daviddoji opened 1 year ago

daviddoji commented 1 year ago

As already discussed with @takluyver, it would be nice to be able to read files recorded by an MCP-DLD and processed by a TDC.

The path and the data types can be seen in the following screenshot: [image]

where 500 is the train ID dimension as chunked by the DAQ and 1000 is the fixed array size required by the DAQ. Of those 1000 columns, only N hold meaningful data, where N is the number of events calculated by the TDC (not necessarily the same as the number of pulses) and can vary from train to train. The remaining columns are just zeros and could be dropped.
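A minimal sketch of how the zero padding described above could be dropped, using synthetic NumPy data in place of the real files. It assumes that a slot is padding exactly when its value is zero and that valid events are packed at the start of each row; a genuine zero measurement would be miscounted under this assumption, so an explicit per-train event count, if the TDC writes one, would be the safer source for N.

```python
import numpy as np

# Synthetic stand-in for one TDC key: 3 trains, fixed width of 6 slots
# (the real files use 1000 slots per train).
t = np.array([
    [5.0, 7.5, 0.0, 0.0, 0.0, 0.0],  # 2 events
    [1.2, 3.4, 9.9, 4.1, 0.0, 0.0],  # 4 events
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # no events
])

# Assumption: zero means padding, so N per train is the non-zero count.
n_events = np.count_nonzero(t, axis=1)

# Option 1: ragged layout, one array of valid events per train.
per_train = [row[:n] for row, n in zip(t, n_events)]

# Option 2: stay rectangular, cut to the largest N and mark padding as NaN.
width = n_events.max()
is_valid = np.arange(width) < n_events[:, None]
trimmed = np.where(is_valid, t[:, :width], np.nan)
```

The ragged layout keeps only real events, while the NaN-padded layout keeps a regular array that still works with axis-based indexing.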

The following path contains a couple of files with "ideally perfect" data (the ADC and timeTag arrays will no longer be present in the data): /gpfs/exfel/exp/SXP/202331/p900367/raw/r0078

Let me know if you need more details.

philsmt commented 1 year ago

In which way are those files not readable now? I must assume it's valid EXDF?

daviddoji commented 1 year ago

> In which way are those files not readable now? I must assume it's valid EXDF?

They can already be read, sorry for not making that clear. It would be nice, though, to stack all of them into coordinates, unless you can suggest other layouts.

philsmt commented 1 year ago

Ah, thanks for the clarification. This is data interpretation though, not data reading. For this, the EXtra package (and its components) seems like the right place. As it happens, something similar is being worked on for DLD detector data reconstructed directly from digitizers.

daviddoji commented 1 year ago

Great. Then, we can close this issue.

philsmt commented 1 year ago

Components are specific to the device they cater to, so I did not mean to imply that this will satisfy your case. Here, one would need an appropriate component that interprets data from the TDC device. Could you please copy this issue over to its repository?

takluyver commented 1 year ago

So reading it at the moment would look something like this:

mcp = run['SXP_MCP/MCP/MCP_BOUND_TEST:output']
x = mcp['data.x'].ndarray()
y = mcp['data.y'].ndarray()
t = mcp['data.t'].ndarray()

I.e. you get multiple arrays which you can use by indexing them in the same way.

David was asking for something like an xarray Dataset, which is a convenient way to work with a group of arrays sharing common axes.
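For illustration, a Dataset of this kind can be assembled by hand with plain xarray. This is only a sketch on synthetic arrays standing in for the results of the `.ndarray()` calls above; the dimension names `trainId` and `event` and the train ID values are assumptions chosen for the example, not names taken from the files.

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for mcp['data.x'].ndarray() etc.:
# 3 trains x 4 event slots each.
train_ids = np.array([1000, 1001, 1002])
x = np.arange(12, dtype=np.float64).reshape(3, 4)
y = x * 10
t = x + 0.5

# All three variables share the same two dimensions.
dims = ("trainId", "event")
ds = xr.Dataset(
    {"x": (dims, x), "y": (dims, y), "t": (dims, t)},
    coords={"trainId": train_ids},
)

# One selection now applies to x, y and t together.
one_train = ds.sel(trainId=1001)
```

Because the variables share coordinates, selecting a train once keeps x, y and t aligned without indexing each array separately.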

The obvious way to do this at present is a component in EXtra, our new package built on top of EXtra-data. So you would do something like:

mcp = MCPDetector(run)
mcp.get_dataset(['x', 'y', 't'])

If there are likely to be more devices with a similar style of output, we could also consider adding a generic API in EXtra-data, something like:

mcp = run['SXP_MCP/MCP/MCP_BOUND_TEST:output']
mcp.xarray_multi('data.*')
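As a rough sketch of how a pattern such as 'data.*' might pick out keys, the standard library's `fnmatch` module can do the glob matching. The helper `select_keys` and the key names below are hypothetical and only illustrate the idea; they are not part of EXtra-data.

```python
from fnmatch import fnmatchcase


def select_keys(keys, pattern):
    """Return the keys matching a glob pattern, in sorted order."""
    return sorted(k for k in keys if fnmatchcase(k, pattern))


# Hypothetical key names for a single source.
keys = {"data.x", "data.y", "data.t", "data.trainId", "metadata.source"}

selected = select_keys(keys, "data.*")
```

In this example the pattern also matches `data.trainId`, so a generic API of this kind would probably need a way to exclude keys that are not meant to become data variables.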