tlambert03 / nd2

Full-featured nd2 (Nikon NIS Elements) file reader for python. Outputs to numpy, dask, and xarray. Exhaustive metadata extraction
https://tlambert03.github.io/nd2
BSD 3-Clause "New" or "Revised" License

only metadata #71

Closed Heerpa closed 2 years ago

Heerpa commented 2 years ago

Description

Is there a way to open a file only for reading metadata, without the overhead generated by preparing to read image data?

tlambert03 commented 2 years ago

how are you currently doing it? The following should be minimal:

import nd2

with nd2.ND2File('some_file.nd2') as f:
    f.attributes        # nd2.structures.Attributes
    f.metadata          # nd2.structures.Metadata
    f.experiment        # List[nd2.structures.ExpLoop]
    f.text_info         # dict of misc info
    f.custom_data       # mishmash of data extracted from file

Heerpa commented 2 years ago

This is basically how I do it, just without the context manager:

import nd2

f = nd2.ND2File('some_file.nd2')
f.attributes        # nd2.structures.Attributes
f.metadata          # nd2.structures.Metadata
f.experiment        # List[nd2.structures.ExpLoop]
f.text_info         # dict of misc info
f.custom_data       # mishmash of data extracted from file
f.close()

Opening the (large) file takes a few seconds, so I was assuming that image data is being prepared there already. Happy to close the issue if that's not the case.

tlambert03 commented 2 years ago

Oh I just noticed you’re on 0.2.2, can you update and try again?

Heerpa commented 2 years ago

Yes, with 0.2.5, opening the file takes a fraction of a second instead of multiple seconds. Thanks very much!
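
For anyone wanting to measure this themselves, here is a minimal timing sketch. The helper name `time_metadata_read` and the `opener` parameter are hypothetical (not part of nd2); the attribute names are the ones listed earlier in this thread, and `opener` is assumed to be any context-manager class such as `nd2.ND2File`.

```python
import time

# Attribute names taken from the snippets earlier in this thread.
METADATA_ATTRS = ("attributes", "metadata", "experiment", "text_info", "custom_data")


def time_metadata_read(opener, path, **kwargs):
    """Time opening `path` and reading its metadata attributes separately.

    `opener` is assumed to be a context-manager factory (e.g. nd2.ND2File).
    Returns (seconds_to_open, seconds_to_read_metadata, metadata_dict).
    """
    start = time.perf_counter()
    with opener(path, **kwargs) as f:
        opened = time.perf_counter()
        meta = {name: getattr(f, name) for name in METADATA_ATTRS}
        done = time.perf_counter()
    return opened - start, done - opened, meta
```

Something like `time_metadata_read(nd2.ND2File, 'some_file.nd2')` should then show whether the time goes into opening the file or into pulling the metadata out.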

tlambert03 commented 2 years ago

excellent! you might also try, with v0.3.0:

import nd2

with nd2.ND2File(..., read_using_sdk=True) as f:
    f.metadata          # get your metadata

I'd be curious to hear what you find there regarding performance.

Backstory: This library actually contains both a wrapper around the official nd2 sdk (which is why all the metadata comes out so nice) and a direct chunkmap-inspection and numpy-memmap reader. The latter was introduced when I found what seemed to be performance benefits of not using the sdk for the actual reading process.

However, that "greedy" inspection of the chunkmap was what led to your original performance hit observations. While that was fixed in 0.2.5, the new read_using_sdk parameter in v0.3.0 essentially opts out of all non-sdk logic (i.e. it doesn't even open a separate file handle for any additional inspection). So, you might even see an additional bump for pure-metadata purposes there. But, if my previous observations are correct, you might see a slight hit on actual data read speed... but I'm not sure :)

... if we eventually find that read_using_sdk is better in all cases (i.e. my original observations were wrong somehow), then we can make that the default again
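
To make the memory-mapped side of that backstory a bit more concrete, here is a toy sketch using only the standard library's `mmap` and `struct`. It is not how nd2 is implemented: the file layout (fixed-size uint16 frames written back to back, no header) and the names `write_toy_file`, `read_frame`, and `demo` are assumptions for illustration. A real nd2 file locates its frames through the chunkmap, and the library uses numpy's memmap rather than this code.

```python
import mmap
import os
import struct
import tempfile

PIXELS_PER_FRAME = 16                   # toy 4x4 frame (assumption)
FRAME_FORMAT = f"<{PIXELS_PER_FRAME}H"  # little-endian uint16 pixels
FRAME_BYTES = struct.calcsize(FRAME_FORMAT)


def write_toy_file(path, n_frames):
    """Write n_frames back to back; every pixel of frame i has the value i."""
    with open(path, "wb") as fh:
        for i in range(n_frames):
            fh.write(struct.pack(FRAME_FORMAT, *([i] * PIXELS_PER_FRAME)))


def read_frame(mm, index):
    """Unpack one frame from the mapping, starting at its byte offset."""
    return struct.unpack_from(FRAME_FORMAT, mm, index * FRAME_BYTES)


def demo():
    """Create a 10-frame toy file, map it read-only, and return frame 7."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_toy_file(path, n_frames=10)
        with open(path, "rb") as fh:
            with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return read_frame(mm, 7)
    finally:
        os.remove(path)
```

The point of the sketch is that mapping the file is cheap and only the requested frame's bytes are unpacked, which is the general idea behind reading frames without going through the sdk.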

Heerpa commented 2 years ago

Cool, thanks. Yes, I just tried 0.3.0 and got the following results (I ran the script twice). Weirdly, both file opening and reading are faster when read_using_sdk is False:

imported packages in 18.54 seconds.
testing read speed of nd2 file using nd2reader
File loaded in 0.1 seconds.
100%|██████████| 30000/30000 [00:46<00:00, 650.62frame/s]
Loaded all frames in 650.62 frames per second.
testing read speed of nd2 file via nd2 package directly. Using SDK:  True
File loaded in 3.41 seconds.
100%|██████████| 30000/30000 [1:17:12<00:00,  6.48frame/s]
Loaded all frames in 6.48 frames per second.
testing read speed of nd2 file via nd2 package directly. Using SDK:  False
File loaded in 0.26 seconds.
100%|██████████| 30000/30000 [28:31<00:00, 17.53frame/s]
Loaded all frames in 17.53 frames per second.

>python test_readspeed.py
imported packages in 15.29 seconds.
testing read speed of nd2 file using nd2reader
File loaded in 0.09 seconds.
100%|██████████| 30000/30000 [00:35<00:00, 836.36frame/s]
Loaded all frames in 836.36 frames per second.
testing read speed of nd2 file via nd2 package directly. Using SDK:  True
File loaded in 2.18 seconds.
100%|██████████| 30000/30000 [1:34:05<00:00,  5.31frame/s]
Loaded all frames in 5.31 frames per second.
testing read speed of nd2 file via nd2 package directly. Using SDK:  False
File loaded in 0.27 seconds.
100%|██████████| 30000/30000 [32:03<00:00, 15.60frame/s]
Loaded all frames in 15.60 frames per second.
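
As a sanity check on the units, the rates can be recomputed from the frame count and the elapsed times shown in the progress bars. The sketch below does that arithmetic; the function names are hypothetical, and it only handles the `MM:SS` and `H:MM:SS` elapsed formats that appear above.

```python
def elapsed_to_seconds(text):
    """Convert an elapsed string such as '28:31' or '1:34:05' to whole seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def frames_per_second(n_frames, elapsed):
    """Average throughput in frames per second over the whole run."""
    return n_frames / elapsed_to_seconds(elapsed)


# Elapsed times for the nd2 runs, copied from the progress bars above.
for label, elapsed in [
    ("run 1, SDK True", "1:17:12"),
    ("run 1, SDK False", "28:31"),
    ("run 2, SDK True", "1:34:05"),
    ("run 2, SDK False", "32:03"),
]:
    print(f"{label}: {frames_per_second(30000, elapsed):.2f} frames per second")
```

These come out at 6.48, 17.53, 5.31 and 15.60, matching the progress bars, so every reported number is a rate in frames per second.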