Closed takluyver closed 8 months ago
Hi @takluyver , thank you for adding this functionality, it is very useful for validating data reduction results. Regarding you questions:
- It could be the opposite, telling you where data is missing, if we think we'll want that more often. Flipping it with the ~ operator is easy in either case.
I like the current approach more, with True
for available data.
- It could give you results by train (16, 200) with either booleans for present/absent, or counts of how many frames are in the train (the count should be the same across modules, except where it's 0).
No, I think the current approach with data per frame is better.
- We could also expose it on the detector object so agipd.data_availability() is the same as agipd['image.data'].data_availability() (using the _main_data_key).
Yes, such simplification will very convenient - could you please add it? I think it would make sense to keep both interfaces, agipd[key].data_availability() for any data key, and agipd.data_availability() for the main data key.
Actually, would it make sense to flip the dimensions and have the frame index first? E.g. (8000, 16).
Actually, would it make sense to flip the dimensions and have the frame index first? E.g. (8000, 16).
My reasoning for doing it this way round is that if you do e.g. agipd['image.data'].ndarray()
, you would get an array of (16, 8000, 512, 128), so this matches the order for the relevant dimensions. Having dimensions in a predictable order seems like a pretty good idea, and it's easy to transpose them with Numpy if you want the other order. Do you think the (8000, 16) shape will be the one we want so often to override that?
Thanks Oleksii! I've added the method to the detector classes now as well, along with a test for the behaviour.
Thank you Thomas!
Do you think the (8000, 16) shape will be the one we want so often to override that?
No, I think it is fine. I was just going from the direction that in the VDS files we have e.g.: data [float32: 155955 × 16 × 512 × 128]
LGTM!
@turkot this is something we were discussing yesterday.
If you have 200 trains of data with 40 frames per train recorded by an AGIPD-1M detector (for instance), this would give you a (16, 8000) array, with True for frames that are present, and False for any that are missing.
Open questions:
~
operator is easy in either case.agipd.data_availability()
is the same asagipd['image.data'].data_availability()
(using the_main_data_key
).