peng-lab / BaSiCPy

MIT License

Memory issue #127

Open drippypale opened 1 year ago

drippypale commented 1 year ago

Hi, I'm trying to call fit() on a relatively large tensor of images. Given the size of the image tensor itself, my 24 GB of RAM fills up and the OS kills the process.

  1. Can I call fit() on multiple small chunks of my images? (I mean calling the fit method of the same BaSiC object on one chunk at a time.)
  2. Do you have any suggestions for resolving the memory (RAM) problem?
import skimage.io
import basicpy

# Load every image for this plate/channel into a single array
image_list = skimage.io.imread_collection(
    (f'{input_path}/' + channel_files[channel_files.str.startswith(plate)]).to_list()
).concatenate()

basic = basicpy.BaSiC(get_darkfield=True, smoothness_flatfield=1)
basic.fit(images=image_list)

images_transformed = basic.transform(image_list, timelapse=False)
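One possible interim workaround (my own assumption, not an official BaSiCPy feature) is to fit the shading model on a random subset of frames, since the flatfield estimate may already be stable with fewer images, and then transform the full stack. A minimal numpy sketch of the subsampling step, using a random stack as a stand-in for real data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder image stack standing in for the real, much larger data
images = rng.random((500, 64, 64), dtype=np.float32)

n_sample = 100  # assumed, tunable fitting budget
idx = rng.choice(images.shape[0], size=n_sample, replace=False)
subset = images[idx]  # only this subset would be passed to basic.fit()

print(subset.shape)                   # (100, 64, 64)
print(subset.nbytes / images.nbytes)  # fraction of memory used for fitting: 0.2
```

Whether a subset gives an acceptable flatfield depends on how uniformly the shading is sampled across frames, so this should be validated against a full fit on a smaller dataset first.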
yfukai commented 1 year ago

Hi @drippypale, thank you for your interest in BaSiCPy! I'm sorry for the late response.

Can I call the fit() on multiple small chunks of my images? (I mean calling the fit method of the same BaSiC object each time on a chunk.)

It would be great if we could run the algorithm on small chunks (and I have some ideas for that), but this is not currently implemented: the state is reset at each run. We'll discuss adding this incremental fitting.

Do you have any suggestions on what I should do to resolve the memory(RAM) problem?

Actually, the package internally works on a "shrunk" version of the images (resized to the working_size, 128x128 by default). This resizing can be done lazily via the dask package, so the full-resolution stack never has to be held in RAM at once. Can you try BaSiC(resize_mode="skimage_dask") to see if it solves the problem? https://basicpy.readthedocs.io/en/latest/api.html#basicpy.basicpy.BaSiC.resize_mode
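To illustrate why lazy resizing reduces peak memory, here is a conceptual numpy sketch (not BaSiCPy's actual skimage_dask implementation): each full-resolution frame is produced, downscaled to the working size, and discarded before the next one is touched, so only the small stack is ever materialized.

```python
import numpy as np

def downscale(frame, size=128):
    """Block-average a square frame down to (size, size).
    Assumes both frame dimensions are divisible by `size`, for simplicity."""
    h, w = frame.shape
    return frame.reshape(size, h // size, size, w // size).mean(axis=(1, 3))

def frames(n, shape=(512, 512), seed=0):
    """Yield full-resolution frames one at a time (stand-in for lazy file reads)."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        yield rng.random(shape, dtype=np.float32)

# Only the shrunken 128x128 stack is kept; each full frame is transient.
small_stack = np.stack([downscale(f) for f in frames(10)])
print(small_stack.shape)  # (10, 128, 128)
```

With resize_mode="skimage_dask", dask plays the role of the generator above: the resize of each frame is described as a lazy task and evaluated per block, rather than after loading the whole stack.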

yfukai commented 1 year ago

Sorry, I mistakenly closed this; reopening.