Closed: elhuhdron closed this issue 11 months ago
@elhuhdron Thanks for the bug report.
A quick question: Are you running this via multiprocessing on a workstation or distributed through a cluster via SLURM?
Thanks for the report! We will look into this. In the meantime, could you test if executing this little python script does the job for you? (I pasted the paths from your command, please double-check that everything looks right. You may also need to remove the half-converted output dataset from wkcuber first).
from cluster_tools import WrappedProcessPoolExecutor
import webknossos as wk

def main():
    ds = wk.Dataset.from_images(
        input_path="/gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs",
        output_path="/gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/ZF-No2-retina-no-rough",
        voxel_size=(64, 64, 35),
        name="ZF-No2-retina-no-rough",
        executor=WrappedProcessPoolExecutor(max_workers=72),
    )
    ds.compress(executor=WrappedProcessPoolExecutor(max_workers=72))
    ds.downsample(sampling_mode="constant_z", executor=WrappedProcessPoolExecutor(max_workers=72))

if __name__ == "__main__":
    main()
> @elhuhdron Thanks for the bug report.
> A quick question: Are you running this via multiprocessing on a workstation or distributed through a cluster via SLURM?
I'm running on a cluster, but interactively from a bash prompt on a single node with 72 cores. In the past I found that, for smaller image stack conversions, the overhead of submitting jobs via SLURM (at least on our cluster) actually makes the conversion slower, so this approach worked better for smaller stacks (anything less than about 1 TB total).
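For reference, here is a minimal sketch of how the executor choice could be parameterized, so the same conversion can run either with local multiprocessing (as above) or distributed via SLURM for larger stacks. The get_executor call and its job_resources argument are assumptions about the installed cluster_tools version, and the paths are placeholders.

import cluster_tools
import webknossos as wk

# Placeholder paths; substitute the real input stack and output dataset.
INPUT = "/path/to/tiff-stack"
OUTPUT = "/path/to/output-dataset"

def convert(executor):
    # Same call as in the script above, just parameterized over the executor.
    return wk.Dataset.from_images(
        input_path=INPUT,
        output_path=OUTPUT,
        voxel_size=(64, 64, 35),
        name="example-dataset",
        executor=executor,
    )

if __name__ == "__main__":
    # Small stacks: local multiprocessing avoids SLURM scheduling overhead.
    convert(cluster_tools.WrappedProcessPoolExecutor(max_workers=72))
    # Larger stacks: distribute via SLURM instead (assumed API and arguments):
    # convert(cluster_tools.get_executor("slurm", job_resources={"mem": "8G"}))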
> Thanks for the report! We will look into this. In the meantime, could you test if executing this little python script does the job for you? (I pasted the paths from your command, please double-check that everything looks right. You may also need to remove the half-converted output dataset from wkcuber first).
Ha! Indeed, with version 0.12.3 this works in about the same amount of time as in the previous version:
/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/pims/api.py:204: UserWarning: <class 'webknossos.dataset._utils.pims_imagej_tiff_reader.PimsImagejTiffReader'> errored: /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs/wafer02_order00943_0232_S232R232_stitched_grid_thumbnail.tiff is not an ImageJ Tiff
warn(message)
/gpfs/soma_fs/home/watkins/miniconda3/envs/wkw-new/lib/python3.9/site-packages/pims/api.py:204: UserWarning: <class 'webknossos.dataset._utils.pims_imagej_tiff_reader.PimsImagejTiffReader'> errored: /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs/wafer02_order00943_0232_S232R232_stitched_grid_thumbnail.tiff is not an ImageJ Tiff
warn(message)
Creating layer from images ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:03:45 | 0:00:00
It did initially complain that I needed to install webknossos[all], which previously I had not done; I had only installed via pip install wkcuber. Is it now important for the CLI to install the full version?
I would however prefer to continue to use the CLI, if possible, but this is a good workaround for now, thank you.
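(For anyone hitting the same complaint: the missing optional dependencies can be installed with the all extra, e.g. the command below; the exact extra to install is whatever the error message asks for.)

pip install "webknossos[all]"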
> with version 0.12.3 this works in about the same amount of time as in the previous version
Good to hear!
The wkcuber CLI still internally uses our “old” implementation (so it does not currently need or make use of webknossos[all]). Apparently this “old” implementation is currently broken in the way you initially reported. We plan to change the wkcuber CLI to internally use code similar to what I suggested above, but that change is not released yet.
With the new CLI, the command would be something like this:
webknossos convert --jobs 72 --voxel_size 64,64,35 --name ZF-No2-retina-no-rough /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/thumbnails_solved_order_ds4-tiffs /gpfs/soma_fs/cne-mSEM/mSEM-proc/2021/briggman/Retina_zebrafish_No2/meta/webknossos/ZF-No2-retina-no-rough
@markbader @normanrz do you know if this includes downsampling? And if so, how to pass the sampling_mode?
I have this working now; it seems that, relative to the old cuber, submitting via SLURM is faster on our cluster again. Additionally, it is orders of magnitude slower if --compress is specified to convert. I've settled on essentially the solution that you suggest: convert, then compress, and then downsample (specifying the sampling_mode to downsample). This works fine for me now using the most recent master branch.
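That stepwise flow can also be expressed directly against the Python API; here is a minimal sketch, assuming Dataset.open is the right way to reopen the already-converted dataset in the installed webknossos version, with a placeholder path.

import webknossos as wk
from cluster_tools import WrappedProcessPoolExecutor

# Placeholder path of the dataset produced by the (uncompressed) convert step.
DATASET_PATH = "/path/to/output-dataset"

def main():
    ds = wk.Dataset.open(DATASET_PATH)
    # Compress as a separate step, after conversion has finished.
    ds.compress(executor=WrappedProcessPoolExecutor(max_workers=72))
    # Downsample with an explicit sampling mode, as described above.
    ds.downsample(sampling_mode="constant_z", executor=WrappedProcessPoolExecutor(max_workers=72))

if __name__ == "__main__":
    main()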
Context
Expected Behavior
I have a tiff stack of uint8 images: count=4490, size=6055x6428. Size on disk is about 163G. I'm calling it on the command line with the intention of using multiprocessing on a 72-core machine, with the following command line:
Current Behavior
In version 0.9.21 the progress bar proceeds normally and mag1 cubing (before compression) finishes in 3.5 minutes:
In version 0.12.3 it keeps producing the following output on multiple lines without finishing (I let it run for 15 minutes):
Steps to Reproduce the bug
Conversion of a simple tiff stack using multiprocessing in the two versions should reproduce the issue.
Your Environment for bug