scalableminds / webknossos-libs

Python API and CLI tools for working with WEBKNOSSOS datasets, annotations and server interactions. Includes converter to OME-Zarr.
https://docs.webknossos.org/webknossos-py/index.html

wkcuber multiprocessing much slower or broken in 0.12.3 #897

Closed by elhuhdron 11 months ago

elhuhdron commented 1 year ago

Context

Expected Behavior

I have a stack of 4490 uint8 tiff images, each 6055x6428 pixels, totaling about 163 GB on disk.

I'm invoking wkcuber from the command line, intending to use multiprocessing on a 72-core machine:

python -m wkcuber --jobs 72 --voxel_size 64,64,35 --max_mag 64 --sampling_mode constant_z --name ZF-No2-retina-no-rough /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/ZF-No2-retina-no-rough

Current Behavior

In version 0.9.21 the progress bar proceeds normally and mag1 cubing (before compression) finishes in about 3.5 minutes:

Cubing from 0 to 4490 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:03:33 | 0:00:00

In version 0.12.3 it repeatedly prints output like the following without ever finishing (I let it run for 15 minutes):

Cubing from 0 to 4490 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:24 | 0:00:00
Cubing from 0 to 4490 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% 0:00:03 | -:--:--/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/webknossos/dataset/view.py:111: DeprecationWarning: [DEPRECATION] view.size is deprecated. Since this is a View, please use view.bounding_box.in_mag(view.mag).size instead.
  warnings.warn(
/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/webknossos/dataset/view.py:100: DeprecationWarning: [DEPRECATION] view.global_offset is deprecated. Since this is a View, please use view.bounding_box.in_mag(view.mag).topleft instead.
  warnings.warn(
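For reference, the deprecation warnings themselves spell out the replacement accessors; assuming view is a webknossos View instance, the migration they describe looks like:

# Deprecated View accessors flagged in the warnings above, and their replacements:
size = view.size                                      # deprecated
size = view.bounding_box.in_mag(view.mag).size        # replacement
offset = view.global_offset                           # deprecated
offset = view.bounding_box.in_mag(view.mag).topleft   # replacement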

Steps to Reproduce the bug

Conversion of a simple tiff stack using multiprocessing in the two versions should reproduce the issue.

Your Environment for bug

  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-1160.el7.x86_64
      Architecture: x86-64
Python 3.9.16
hotzenklotz commented 1 year ago

@elhuhdron Thanks for the bug report.

A quick question: Are you running this via multiprocessing on a workstation or distributed through a cluster via SLURM?

fm3 commented 1 year ago

Thanks for the report! We will look into this. In the meantime, could you test whether this little Python script does the job for you? (I pasted the paths from your command; please double-check that everything looks right. You may also need to remove the half-converted output dataset from wkcuber first.)

from cluster_tools import WrappedProcessPoolExecutor
import webknossos as wk

def main():
    # Cube the tiff stack into a WEBKNOSSOS dataset (mag 1, uncompressed).
    ds = wk.Dataset.from_images(
        input_path="/gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs",
        output_path="/gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/ZF-No2-retina-no-rough",
        voxel_size=(64, 64, 35),
        name="ZF-No2-retina-no-rough",
        executor=WrappedProcessPoolExecutor(max_workers=72),
    )
    # Compress after cubing, then downsample without reducing z resolution.
    ds.compress(executor=WrappedProcessPoolExecutor(max_workers=72))
    ds.downsample(sampling_mode="constant_z", executor=WrappedProcessPoolExecutor(max_workers=72))

if __name__ == "__main__":
    main()
elhuhdron commented 1 year ago

> @elhuhdron Thanks for the bug report.
>
> A quick question: Are you running this via multiprocessing on a workstation or distributed through a cluster via SLURM?

I'm running on a cluster, but interactively from a bash prompt on a single node with 72 cores. In the past I found that for smaller image-stack conversions (anything less than about 1 TB total), the overhead of submitting jobs via SLURM, at least on our cluster, actually made the conversion slower, so running multiprocessing on one node worked better.
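For reference, cluster_tools exposes both strategies behind the same executor interface, so switching is a one-line change; a minimal sketch (the job_resources values here are placeholders to adapt to your cluster):

import cluster_tools

# Local multiprocessing on a single 72-core node:
local_executor = cluster_tools.get_executor("multiprocessing", max_workers=72)

# Distributed via SLURM instead; job_resources is cluster-specific (placeholder values):
slurm_executor = cluster_tools.get_executor("slurm", job_resources={"mem": "4G"})

Either executor can then be passed as the executor= argument in the script above.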

elhuhdron commented 1 year ago

> Thanks for the report! We will look into this. In the meantime, could you test whether this little Python script does the job for you? (I pasted the paths from your command; please double-check that everything looks right. You may also need to remove the half-converted output dataset from wkcuber first.)
>
> (script quoted above)

Ha! Indeed, with version 0.12.3 this works in about the same amount of time as in the previous version:

/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/pims/api.py:204: UserWarning: <class 'webknossos.dataset._utils.pims_imagej_tiff_reader.PimsImagejTiffReader'> errored: /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs/wafer02_order00943_0232_S232R232_stitched_grid_thumbnail.tiff is not an ImageJ Tiff
  warn(message)
/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/pims/api.py:204: UserWarning: <class 'webknossos.dataset._utils.pims_imagej_tiff_reader.PimsImagejTiffReader'> errored: /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs/wafer02_order00943_0232_S232R232_stitched_grid_thumbnail.tiff is not an ImageJ Tiff
  warn(message)
Creating layer from images ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:03:45 | 0:00:00

It did initially complain that I needed to install webknossos[all], which I had not done previously; I had only installed via pip install wkcuber. Is installing the full extras now required for the CLI?
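For reference, the extras install it asked for would presumably be:

pip install "webknossos[all]"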

I would, however, prefer to continue using the CLI if possible, but this is a good workaround for now, thank you.

fm3 commented 1 year ago

> with version 0.12.3 this works in about the same amount of time as in the previous version

Good to hear!

The wkcuber CLI still internally uses our “old” implementation (so it does not currently need or make use of webknossos[all]). Apparently this “old” implementation is currently broken in the way you initially reported. We plan to change the wkcuber CLI to internally use code similar to what I suggested above, but that is not released yet.

fm3 commented 1 year ago

With the new CLI, the command would be something like this:

webknossos convert --jobs 72 --voxel_size 64,64,35 --name ZF-No2-retina-no-rough /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/ZF-No2-retina-no-rough

@markbader @normanrz do you know if this includes downsampling? And if so, how to pass the sampling_mode?

elhuhdron commented 1 year ago

I have this working now; relative to the old cuber, submitting via SLURM is faster on our cluster again. Additionally, conversion is orders of magnitude slower if --compress is specified to convert. So I've settled on essentially the solution you suggested: convert, then compress, then downsample (specifying the sampling_mode to downsample). This works fine for me now using the most recent master branch.
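For anyone landing here, that three-step workflow would look roughly like the following; the subcommand and flag spellings are assumed from the new CLI rather than copied from a working run, and SOURCE and TARGET stand for the input and output paths above, so check webknossos --help before copying:

webknossos convert --jobs 72 --voxel-size 64,64,35 SOURCE TARGET
webknossos compress --jobs 72 TARGET
webknossos downsample --jobs 72 --sampling-mode constant_z TARGET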