tlambert03 / nd2

Full-featured nd2 (Nikon NIS Elements) file reader for python. Outputs to numpy, dask, and xarray. Exhaustive metadata extraction
https://tlambert03.github.io/nd2
BSD 3-Clause "New" or "Revised" License

read speed #70

Closed Heerpa closed 2 years ago

Heerpa commented 2 years ago

Description

I have long microscopy movies, e.g. 30000 frames, with no dimensions other than X and Y. All frames are analysed independently.
For reading .nd2 files, I've tested nd2reader and nd2 (which I'd prefer because of its better access to the metadata). Using nd2reader, I can read about 800 frames/second (single process). Using nd2, I get about 20 frames/second when converting frames from dask to numpy with .compute(). If I only load the frames as dask arrays, I get 2000 frames/second, but of course I need to do the computations in numpy.

Is there a way to load planes faster? I don't have enough memory to directly load the complete dataset.

What I Did

import time, sys, os
tic = time.time()
import numpy as np
import nd2
from nd2reader import ND2Reader
import yaml
from tqdm import tqdm

def test_readspeed_nd2direct():
    print('testing read speed of nd2 file via nd2 package directly.')
    t00 = time.time()

    movie = nd2.ND2File(filename_nd2).to_dask()

    dt = np.round(time.time() - t00, 2)
    print(f"File loaded in {dt} seconds.")

    tic = time.time()
    n_frames = len(movie)
    with tqdm(total=n_frames, unit="frame") as progress_bar:
        for i in range(n_frames):
            frame = movie[i].compute()
            progress_bar.update()
    # The value printed is n_frames / elapsed time, i.e. frames per second.
    print('Loaded all frames in {:.2f} frames per second.'.format(
        (n_frames/(time.time()-tic))))

def test_readspeed_nd2reader():
    print('testing read speed of nd2 file using nd2reader')
    t00 = time.time()

    # with ND2Reader(filename_nd2) as movie:
    movie = ND2Reader(filename_nd2)
    dt = np.round(time.time() - t00, 2)
    print(f"File loaded in {dt} seconds.")

    tic = time.time()
    n_frames = len(movie)
    with tqdm(total=n_frames, unit="frame") as progress_bar:
        for i in range(n_frames):
            frame = movie[i]
            progress_bar.update()
    print('Loaded all frames in {:.2f} frames per second.'.format(
        (n_frames/(time.time()-tic))))
    movie.close()

if __name__ == '__main__':
    print('imported packages in {:.2f} seconds.'.format(
        time.time()-tic))

    test_readspeed_nd2reader()
    test_readspeed_nd2direct()

output

imported packages in 15.89 seconds.
testing read speed of nd2 file using nd2reader
File loaded in 0.06 seconds.
100%|████████████████████| 30000/30000 [00:39<00:00, 753.40frame/s]
Loaded all frames in 753.22 frames per second.
testing read speed of nd2 file via nd2 package directly.
File loaded in 3.09 seconds.
  7%|█▍                  | 2109/30000 [02:05<24:28, 19.00frame/s]

tlambert03 commented 2 years ago

Similar to #71: can you try updating nd2 and see if you get an improvement? Some things got better in https://github.com/tlambert03/nd2/pull/51 (included in nd2 >=0.2.3). I wouldn't expect that to fully solve everything, but it may give a small gain.
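
For readers hitting the same bottleneck, one possible workaround (a sketch, not something confirmed in this thread) is to pull frames out of the dask array in batches instead of calling .compute() once per frame. That way any fixed per-call overhead is paid once per batch, and memory use stays bounded by the batch size. The helper name `iter_frame_batches` and the batch size are hypothetical, and the demo uses a small NumPy array as a stand-in for `nd2.ND2File(...).to_dask()`. Whether this actually helps depends on the nd2 version and on how the dask array is chunked.

```python
import numpy as np


def iter_frame_batches(movie, batch_size=256):
    """Yield (start_index, batch) pairs from a frame-indexed array-like.

    `batch` is a NumPy array holding up to `batch_size` consecutive frames.
    `movie` may be a dask array: np.asarray() on a dask array computes it,
    so there is one compute call per batch rather than one per frame.
    """
    n_frames = len(movie)
    for start in range(0, n_frames, batch_size):
        stop = min(start + batch_size, n_frames)
        yield start, np.asarray(movie[start:stop])


# Stand-in data so the sketch runs without an .nd2 file: 10 frames of 4x5 pixels.
# With a real file, the dask array from nd2.ND2File(path).to_dask() would be
# passed in place of demo_movie (assumption: frames lie along axis 0).
demo_movie = np.arange(10 * 4 * 5, dtype=np.uint16).reshape(10, 4, 5)

frame_means = np.empty(len(demo_movie))
for start, batch in iter_frame_batches(demo_movie, batch_size=4):
    # Each frame is still analysed independently, but in plain NumPy.
    frame_means[start:start + len(batch)] = batch.mean(axis=(1, 2))
```

The batch size trades memory for fewer compute calls; for a 30000-frame movie it would be chosen so that one batch fits comfortably in RAM.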

tlambert03 commented 2 years ago

Assuming this is fixed, like #71.