Open daviddoji opened 1 year ago
In which way are those files not readable now? I must assume it's valid EXDF?
They can already be read, sorry for not making that clear. But it would be nice to stack all of them into one object with shared coordinates, unless you can suggest other layouts.
Ah, thanks for the clarification. This is data interpretation though, not data reading. For this, the extra package (and its components) seems like the right place. As it happens, something similar is being worked on for DLD detector data reconstructed from digitizers directly.
Great. Then, we can close this issue.
Components are specific to the device they cater to, I did not mean to imply this will satisfy your case. Here, one would need an appropriate component interpreting data from the TDC device. Could you please copy this issue over to its repository?
So reading it at the moment would look something like this:
mcp = run['SXP_MCP/MCP/MCP_BOUND_TEST:output']
x = mcp['data.x'].ndarray()
y = mcp['data.y'].ndarray()
t = mcp['data.t'].ndarray()
I.e. you get multiple arrays which you can use by indexing them in the same way.
David was asking for something like an xarray Dataset, which is a convenient way to work with a group of arrays sharing common axes.
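To make the request concrete, here is a minimal sketch of that layout. It uses synthetic arrays in place of the real `mcp['data.x'].ndarray()` output; the shapes, the train ID values and the dimension names `trainId` and `event` are assumptions for illustration, not something EXtra-data defines.

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for the arrays returned by mcp['data.x'].ndarray() etc.
# Shape assumed here: (number of trains, fixed number of slots per train).
n_trains, n_slots = 4, 6
rng = np.random.default_rng(0)
train_ids = np.arange(1000, 1000 + n_trains)  # made-up train IDs
arrays = {name: rng.random((n_trains, n_slots)) for name in ("x", "y", "t")}

# One Dataset holding all three arrays on the same two axes.
ds = xr.Dataset(
    {name: (("trainId", "event"), arr) for name, arr in arrays.items()},
    coords={"trainId": train_ids},
)

# Selecting one train now slices x, y and t consistently.
one_train = ds.sel(trainId=1001)
```

The point of the Dataset is that one selection applies to every variable at once, rather than indexing three separate arrays in the same way by hand.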
The obvious way to do this at present is a component in EXtra, our new package built on top of EXtra-data. So you would do something like:
mcp = MCPDetector(run)
mcp.get_dataset(['x', 'y', 't'])
If there are likely to be more devices with a similar style of output, we could also consider adding a generic API in EXtra-data, something like:
mcp = run['SXP_MCP/MCP/MCP_BOUND_TEST:output']
mcp.xarray_multi('data.*')
As already discussed with @takluyver, it would be nice to be able to read files recorded by an MCP-DLD that are processed by a TDC.
The path and the data types can be seen in the following screenshot
where 500 is the number of trains the DAQ puts in one chunk and 1000 is the fixed array length the DAQ requires. Of those 1000 columns, only N hold meaningful data, where N is the number of events calculated by the TDC (not necessarily the same as the number of pulses) and can vary from train to train. The remaining columns are just zeros and could be dropped.
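As a rough sketch of what dropping the padding could look like, the snippet below works on a small synthetic array. It assumes that the valid events occupy the first N slots of each row and that the per-train count is available as an array; the name `n_events` and the helper `flatten_events` are hypothetical, since the key holding N is not named here.

```python
import numpy as np

def flatten_events(values, n_events, train_ids):
    """Drop zero padding, keeping the first n_events[i] entries of row i.

    Returns (train ID per event, event values) as two flat arrays.
    Assumes valid events are packed at the start of each row.
    """
    n_events = np.asarray(n_events)
    # True for slot j of train i when j < n_events[i]
    mask = np.arange(values.shape[1]) < n_events[:, None]
    return np.repeat(train_ids, n_events), values[mask]

# Tiny synthetic example: 3 trains, 4 slots each, zero padded.
values = np.array([[1.0, 2.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0, 0.0],
                   [4.0, 5.0, 6.0, 0.0]])
n_events = [2, 1, 3]          # hypothetical per-train event counts
train_ids = [10, 11, 12]      # made-up train IDs

event_train_ids, event_values = flatten_events(values, n_events, train_ids)
```

Using the count rather than testing for zeros avoids discarding a genuine event whose value happens to be zero.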
In the following path, one can find a couple of files with "ideally perfect" data (arrays ADC and timeTag will not be present in the data anymore): /gpfs/exfel/exp/SXP/202331/p900367/raw/r0078
Let me know if you need more details.