Closed danielballan closed 8 years ago
@danielballan The shape of the data that @yugangzhang is getting from filestore is 'shape': [3000, 2167, 2070]. The EigerImages pims reader is correctly returning the lazily loaded data set. The problem is that there are only two data points in the data set and each is 3000x2167x2070. I'm pretty sure that we would need to write a new handler that lazily loads each of the 3000 frames independently.
If you already knew that, sorry :cry:
OK, that sounds right. h5py makes it straightforward to load partial data sets, so we can certainly write a lazier handler.
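For reference, a minimal sketch of what partial loading looks like in h5py. The file, the dataset name "data", and the tiny shape are made up for illustration; real Eiger files are laid out differently, so treat this as a demonstration of the slicing behavior only.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small stand-in file. The dataset name "data" and the shape
# (5 frames of 4x3 pixels) are assumptions for this example only.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    f.create_dataset(
        "data",
        data=np.arange(5 * 4 * 3, dtype=np.uint16).reshape(5, 4, 3),
        chunks=(1, 4, 3),  # one chunk per frame
    )

with h5py.File(path, "r") as f:
    dset = f["data"]           # opening the dataset reads no pixel data
    n_frames = dset.shape[0]   # shape comes from the file's metadata
    frame = dset[2]            # reads only frame 2 into memory
    corner = dset[2, :2, :2]   # or only part of a single frame
```

Indexing the dataset object returns a NumPy array holding just the requested slice, which is the behavior a lazier handler could build on.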
@ericdill that's what I meant.
@yugangzhang See how far you get with this, reading partial data sets. Read the documentation for h5py.
Thanks! It helps.
@sameera2004 reports that you are disappointed in the slowness of these two lines:
imgs = get_images(...)
imgs[0]
Just to be sure we're on the same page, this is slow because each individual frame of imgs is a huge cube, ~1000^3 pixels, so even accessing the very first frame is costly.
I suggested a LazyEigerHandler that can load partial frames. I think you, @yugangzhang, can lead the way on this. I'm happy to provide more guidance as needed.
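To make the idea concrete, here is a rough sketch of a per-frame lazy wrapper. It is not the actual LazyEigerHandler: the names LazyFrames and FakeCube are hypothetical, and FakeCube only stands in for an on-disk 3D dataset (for example an h5py Dataset) so that the reads can be counted.

```python
import operator


class LazyFrames:
    """Sequence of frames that loads one frame per index access.

    `source` is anything that supports `source[i]` and has a `shape`
    attribute of the form (n_frames, rows, cols).
    """

    def __init__(self, source):
        self._source = source

    def __len__(self):
        return self._source.shape[0]

    def __getitem__(self, i):
        i = operator.index(i)      # rejects slices and other non-integers
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._source[i]     # only this one frame is read


class FakeCube:
    """Stand-in for an on-disk cube; records which frames were read."""

    def __init__(self, n_frames, rows, cols):
        self.shape = (n_frames, rows, cols)
        self.reads = []

    def __getitem__(self, i):
        self.reads.append(i)
        return [[i] * self.shape[2] for _ in range(self.shape[1])]


cube = FakeCube(3000, 2, 2)
frames = LazyFrames(cube)
first = frames[0]   # touches frame 0 only, not the whole cube
```

With this shape of object, the cost of frames[0] is one 2D frame rather than the full 3000-frame cube.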
I will get to this issue soon.
@heroux This is also of interest to the *MX folks who have eigers, please forward this to the correct people on those beamlines.
I wrote a working prototype that @sameera2004 has in some of her notebooks. We will give it a permanent home in the NSLS-II/eiger-io repo.
Sameera, I originally said I would do this but I'm swamped. Any chance you could make a PR out of the code in your notebook? It can be pasted as-is, I think, into eiger_io/fs_handlers. Then I will take a look at it.
@danielballan sure I will create a PR
@danielballan I created PR #4 in eiger_io/fs_handlers: https://github.com/NSLS-II-CHX/eiger-io/pull/4
On Wed, 4 Nov 2015, Thomas A Caswell wrote:
@heroux This is also of interest to the *MX folks who have eigers, please forward this to the correct people on those beamlines.
probably not :(
Existing MX packages all take a file prefix/pattern and read directly from the filesystem. Unless we're able to exceed the performance of whatever san/nas filesystem is used, it would just be slowing things down.
New MX packages are starting to look at being able to consume streams, but the point of that is to get a stream directly from the detector to minimize latency.
Only place I see this maybe benefiting MX is in the backend of the SynchWeb knockoff? But that'll be fetching jpg's of images via FileStore?
-matt
This is an issue about handlers for images from Eiger detectors. We could use filestore to index a collection of JPEGs, but that would use a JPEG handler, nothing to do with Eiger, right?
uh, I think so?
I was just responding about the lazy loading stuff being interesting for MX.
Quoting an email from @yugangzhang
Immediate reactions:
from eiger_io.pims_reader import EigerImages as Images. Why are you doing that? There is from databroker import get_images.
np.array. Simply removing that call will do what I think @yugangzhang wants.
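The email's code isn't reproduced here, so the following is only a guess at the pattern in question: wrapping a lazy image sequence in np.array forces every frame to be loaded up front. CountingFrames is a hypothetical toy class, not part of pims, databroker, or eiger_io.

```python
import numpy as np


class CountingFrames:
    """Toy lazy sequence; counts how many frames have been loaded."""

    def __init__(self, n_frames, shape):
        self._n = n_frames
        self._shape = shape
        self.loads = 0

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        self.loads += 1            # pretend this is an expensive disk read
        return np.full(self._shape, i, dtype=np.uint16)


lazy = CountingFrames(4, (2, 3))

one = lazy[0]                       # loads a single frame
loads_after_indexing = lazy.loads

eager = np.array(lazy)              # walks the whole sequence to build the array
loads_after_np_array = lazy.loads
```

Indexing the lazy object directly keeps the per-frame cost; converting it to an array first pays for every frame before any of them is used.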