dask / dask-blog

Dask development blog
https://blog.dask.org/

Blogpost using Dask Array with ITK #27

Closed mrocklin closed 2 years ago

mrocklin commented 5 years ago

We've run into some people who use ITK a bit on imaging data. A blogpost that showed a simple-yet-realistic workflow with Dask Array and ITK together would probably help onboard these groups more effectively.

In conversation with @jakirkham and Gokul we were thinking about something like the following:

  1. Read in some data #26
  2. Deskew some imaging data coming off of an advanced microscope
  3. Deconvolve that data to sharpen things up a bit
  4. Run segmentation to get out regions of interest

This workflow is far from set in stone. I suspect that things will quickly change when someone tries this on real data, but it might provide an initial guideline. Also, as @jakirkham brought up, it might be worth splitting this up into a few smaller posts.

cc @thewtex for visibility. I think that he has also been thinking about these same problems of using Dask to scale out ITK workloads on large datasets.

jakirkham commented 5 years ago

Thanks for raising this Matt.

guillaumeeb commented 5 years ago

Just chiming in here:

At CNES, we develop a library called Orfeo ToolBox for processing satellite imagery which is heavily based on ITK.

Using Dask to scale and simplify the use of OTB is something we have had in mind for several months. So far we have not been able to think of an easy way to do it once the processing becomes a bit complex, so we could probably benefit a lot from experience with ITK. I might again be a little out of scope here, and we should perhaps raise the issue somewhere else.

cc'ing @jmichel-otb again, who probably has a clearer vision of all this.

jakirkham commented 5 years ago

Maybe these posts would be useful for guiding that work. Are the operations listed above representative of use cases you have? Are there other things you are doing?

thewtex commented 5 years ago

@mrocklin @jakirkham I will certainly help with this! :-)

@guillaumeeb @jmichel-otb I wonder if we can get ITK's Python wrapping and packaging infrastructure applied to OTB filters so we can use filters in distributed processing pipelines with Dask?

@jakirkham your first post looks outstanding :clap:. The more I learn about zarr, the more excited I get, and I will add zarr support to ITK :construction_worker_man:.

mrocklin commented 5 years ago

Thanks for the offer to help @thewtex . We readily accept :)

I think that for the image data in the first blogpost, the two things that @rxist525 mentioned would be nice are RL deconvolution, followed by segmentation. I got the sense that these are well handled by routines in ITK today. My guess is that this work can be broken up into two pieces:

  1. Find the right algorithms within ITK, learn how to apply them to NumPy arrays in Python, and then use a Dask Array's map_blocks method to apply them to many images in the image stack, assuming that they are embarrassingly parallel, or map_overlap if they need some halo (see the sketch just after this list). My understanding is that for the data described in the image loading blogpost, the embarrassingly parallel map_blocks method should work fine (though I'm far from an expert here).

    My guess is that this first step could be done without access to the data, perhaps even using some random dataset with the same structure, like the following:

    import dask.array as da
    da.random.randint(0, 2**16-1, size=(3, 199, 201, 1024, 768), chunks=(1, 1, None, None, None), dtype='uint16')
  2. Do this same computation on the actual data on some parallel machine, and look at the scientific results to verify that they're meaningful.
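
For reference, a rough sketch of that first step on the synthetic array above; the deconvolve function here is only a placeholder for the eventual ITK routine:

import numpy as np
import dask.array as da

# Synthetic stand-in for the acquisition: (channel, time, z, y, x)
imgs = da.random.randint(0, 2**16 - 1,
                         size=(3, 199, 201, 1024, 768),
                         chunks=(1, 1, None, None, None),
                         dtype='uint16')

def deconvolve(block):
    # Placeholder for the real ITK routine applied to one (1, 1, z, y, x) chunk
    return block.astype(np.float32)

# Embarrassingly parallel: each chunk is processed independently
out = imgs.map_blocks(deconvolve, dtype=np.float32)

# If a halo is needed, map_overlap shares `depth` pixels between neighboring
# chunks along the spatial axes before applying the function
out_halo = imgs.map_overlap(deconvolve, depth=(0, 0, 8, 8, 8), dtype=np.float32)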

My guess is that @thewtex (or someone close by) will be particularly well qualified to do the first part of finding the right operations to use and stitching them together effectively, while either @thewtex or @jakirkham could do the second bit.

@thewtex @jakirkham any thoughts on the structure above? What is the right way to start work on this? Personally, I'd love to have a blogpost out about this well before the SciPy conference, where there seems to be a concentration of imaging work being discussed. If we got this out well beforehand that would give us some space to also look into GPU acceleration, which is an obvious interest of mine these days.

thewtex commented 5 years ago

@mrocklin sounds like a great plan! :world_map:

I will start on 1. I downloaded @rxist525's data yesterday, and I will first reproduce @jakirkham's post. We have RL deconvolution in ITK, and we can apply that as a first step. @rxist525 @jakirkham do you have suggestions for the deconvolution kernel?

rxist525 commented 5 years ago

@thewtex - great question. We have experimentally measured PSFs for the data I uploaded, I'll add them to the directory. For the decon kernel, this can be used as-is or after calculating the OTF (depending on the input of the RL decon function in ITK). @jakirkham @mrocklin we are currently using this RL cudaDecon implementation (https://github.com/dmilkie/cudaDecon), but anticipate needing the CPU version (based on the limitations of GPU memory).

mrocklin commented 5 years ago

Personally I'm fine focusing on CPU code first and just exploring Dask Array + CPU code on imaging data.

I think that after we have a couple of clearly-valuable workflows then we can look at what can be done to accelerate them with GPUs.

thewtex commented 5 years ago

@thewtex - great question. We have experimentally measured PSFs for the data I uploaded, I'll add them to the directory.

@rxist525 awesome! I am having a hard time finding the PSF files -- is there a direct link?

rxist525 commented 5 years ago

Oops, I dropped the ball- I will upload tonight and also send a note with the link.

thewtex commented 5 years ago

Thanks!

rxist525 commented 5 years ago

Just uploaded the PSFs for the three channels; they can be found here: https://drive.google.com/open?id=13udO-h9epItG5MNWBp0VxBkKCllYBLQF. The z-step for the PSF is 0.1um; the samples were imaged at a 0.2um z-step.
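
For what it's worth, one rough way to put the PSF and the data on the same z grid is simple resampling, e.g. with scipy (the 0.5 factor is 0.1um divided by 0.2um; renormalizing afterwards keeps it a valid kernel):

import numpy as np
from scipy import ndimage

psf = np.random.random((64, 32, 32)).astype(np.float32)  # stand-in for a loaded PSF volume

# The PSF is sampled at 0.1um in z while the data were imaged at 0.2um,
# so downsample z by a factor of two and renormalize
psf_resampled = ndimage.zoom(psf, (0.5, 1.0, 1.0), order=1)
psf_resampled /= psf_resampled.sum()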

jakirkham commented 5 years ago

@thewtex, I tried performing Richardson-Lucy deconvolution using ITK and Dask on the AOLLSM dataset. Here's a rough notebook. Please let me know what you think.

rxist525 commented 5 years ago

What does the output look like? The voxel z-size differs between the data and the PSF; is this info in the header data?

jakirkham commented 5 years ago

Thanks for pointing that out.

Right now I'm actually running into some issues earlier on. For instance I'm not able to pickle parts of ITK (see below). This is important for Dask to distribute tasks to multiple workers. Am attempting some workarounds (local imports), but run into other import issues that I've yet to reduce to MCVEs.

xref: https://github.com/InsightSoftwareConsortium/ITK/issues/1050 xref: https://github.com/InsightSoftwareConsortium/ITK/issues/1051

mrocklin commented 5 years ago

We could also start with the in-process threaded scheduler for a first pass?
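
For example, assuming out is the lazy deconvolved array, something like this keeps everything in one process:

import dask

# Run on the in-process threaded scheduler instead of distributed workers,
# which sidesteps pickling the itk module entirely
out.sum().compute(scheduler="threads")

# or set it globally for the session
dask.config.set(scheduler="threads")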

jakirkham commented 5 years ago

Initially I tried working in memory without Dask, which is where this function comes from. That worked ok. It's a bit slow, but I could also be doing things incorrectly. Advice from someone more familiar with ITK would be welcome. 😉

jakirkham commented 5 years ago

Including this traceback here in case someone spots something (even though I haven't found an MCVE). This is reproducible with the notebook and the AOLLSM dataset.

```python
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
----> 1 imgs_deconv.to_zarr("/public/AOLLSMData_m4_raw.zarr/", "deconvolved", overwrite=True)

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/array/core.py in to_zarr(self, *args, **kwargs)
   2161         See function ``to_zarr()`` for parameters.
   2162         """
-> 2163         return to_zarr(self, *args, **kwargs)
   2164
   2165     def to_tiledb(self, uri, *args, **kwargs):

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/array/core.py in to_zarr(arr, url, component, storage_options, overwrite, compute, return_stored, **kwargs)
   2743         **kwargs
   2744     )
-> 2745     return arr.store(z, lock=False, compute=compute, return_stored=return_stored)
   2746
   2747

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/array/core.py in store(self, target, **kwargs)
   1227     @wraps(store)
   1228     def store(self, target, **kwargs):
-> 1229         r = store([self], [target], **kwargs)
   1230
   1231         if kwargs.get("return_stored", False):

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/array/core.py in store(sources, targets, lock, regions, compute, return_stored, **kwargs)
    865
    866     if compute:
--> 867         result.compute(**kwargs)
    868         return None
    869     else:

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
    173         dask.base.compute
    174         """
--> 175         (result,) = compute(self, traverse=False, **kwargs)
    176         return result
    177

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    444     keys = [x.__dask_keys__() for x in collections]
    445     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 446     results = schedule(dsk, keys, **kwargs)
    447     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    448

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2507             should_rejoin = False
   2508         try:
-> 2509             results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2510         finally:
   2511             for f in futures.values():

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1798             direct=direct,
   1799             local_worker=local_worker,
-> 1800             asynchronous=asynchronous,
   1801         )
   1802

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
    761             return future
    762         else:
--> 763             return sync(self.loop, func, *args, **kwargs)
    764
    765     def __repr__(self):

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    332         e.wait(10)
    333     if error[0]:
--> 334         six.reraise(*error[0])
    335     else:
    336         return result[0]

~/miniconda/envs/img_exp/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/utils.py in f()
    317             if timeout is not None:
    318                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 319             result[0] = yield future
    320         except Exception as exc:
    321             error[0] = sys.exc_info()

~/miniconda/envs/img_exp/lib/python3.7/site-packages/tornado/gen.py in run(self)
    733
    734                     try:
--> 735                         value = future.result()
    736                     except Exception:
    737                         exc_info = sys.exc_info()

~/miniconda/envs/img_exp/lib/python3.7/site-packages/tornado/gen.py in run(self)
    740                     if exc_info is not None:
    741                         try:
--> 742                             yielded = self.gen.throw(*exc_info)  # type: ignore
    743                         finally:
    744                             # Break up a reference to itself

~/miniconda/envs/img_exp/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1655                             exc = CancelledError(key)
   1656                         else:
-> 1657                             six.reraise(type(exception), exception, traceback)
   1658                             raise exc
   1659                     if errors == "skip":

~/miniconda/envs/img_exp/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    690         value = tp()
    691     if value.__traceback__ is not tb:
--> 692         raise value.with_traceback(tb)
    693     raise value
    694     finally:

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/optimization.py in __call__()
   1057         if not len(args) == len(self.inkeys):
   1058             raise ValueError("Expected %d args, got %d" % (len(self.inkeys), len(args)))
-> 1059         return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
   1060
   1061     def __reduce__(self):

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/core.py in get()
    147     for key in toposort(dsk):
    148         task = dsk[key]
--> 149         result = _execute_task(task, cache)
    150         cache[key] = result
    151     result = _execute_task(out, cache)

~/miniconda/envs/img_exp/lib/python3.7/site-packages/dask/core.py in _execute_task()
    117         func, args = arg[0], arg[1:]
    118         args2 = [_execute_task(a, cache) for a in args]
--> 119         return func(*args2)
    120     elif not ishashable(arg):
    121         return arg

<ipython-input> in itk_RichardsonLucyDeconvolutionImageFilter()
      8
      9     # Get ITK Image buffers from NumPy arrays (assumes 3-D float32 arrays)
---> 10     img = itk.PyBuffer[itk.Image.F3].GetImageFromArray(img)
     11     psf = itk.PyBuffer[itk.Image.F3].GetImageFromArray(psf)
     12

AttributeError: module 'itk' has no attribute 'PyBuffer'
```

jakirkham commented 5 years ago

Have added this (admittedly kludgy) piece of code to import itk beforehand and try accessing some things from it. This seems to mitigate the issue. I should add that importing itk beforehand on its own does not fix the issue.
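
For reference, a rough sketch of forcing that warm-up on every worker with Client.run (the attribute touched here is only illustrative):

from distributed import Client

client = Client()

def warm_itk():
    # Touch an attribute so ITK's lazy loader actually pulls in its submodules
    import itk
    itk.Image
    return True

client.run(warm_itk)  # runs the function once on each worker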

jakirkham commented 5 years ago

Am curious if ITK is holding the GIL during these operations. Noticing that workers that are running the deconvolution are not updating any of their usage statistics. Though htop tells a different story.

jakirkham commented 5 years ago

Have added this (admittedly kludgy) piece of code to import itk beforehand and try accessing some things from it. This seems to mitigate the issue. I should add that importing itk beforehand on its own does not fix the issue.

Seems this still doesn't always work. After getting a few jobs running, one failed with this AttributeError again.

thewtex commented 5 years ago

@jakirkham @rxist525 @mrocklin sorry for the delayed follow-up on this -- I have been moving houses over the last few days. Awesome work. I am now back online and will take a look!

jakirkham commented 5 years ago

No worries. Congrats on the new house! Please let me know if I can help out.

thewtex commented 5 years ago

Am curious if ITK is holding the GIL during these operations. Noticing that workers that are running the deconvolution are not updating any of their usage statistics. Though htop tells a different story.

Good observation! Most operations are multi-threaded, but they are not releasing the GIL, which may be important for Dask and more ... I have started a patch to address this.

Right now I'm actually running into some issues earlier on. For instance I'm not able to pickle parts of ITK (see below). This is important for Dask to distribute tasks to multiple workers. Am attempting some workarounds (local imports), but run into other import issues that I've yet to reduce to MCVEs.

To provide some context on what is happening here -- the itk package and ITK modules contained within are LazyITKModule's. This means that the modules and their dependencies are loaded on demand. This means import itk is extremely quick, and when any attribute is used on itk only the dependent modules get loaded, which reduces load time.

A workaround to disable this behavior is:

import itkConfig
itkConfig.LazyLoading = False
# Now everything is loaded
import itk

That said, I would like to avoid the need for this with Dask if possible.

LazyITKModule inherits from the vanilla Python types.ModuleType. I noticed that modules in general are not pickle-able. For example,

import sys
import pickle
pickle.dumps(sys)

fails in the same way as

import itk
import pickle
pickle.dumps(itk)

LazyITKModule is masking some of the methods of ModuleType, though, so I will see if exposing those again fixes these issues.

thewtex commented 5 years ago

@jakirkham btw, I reproduced your blog post with @rxist525 's data -- very well done. I tried a few things regarding performance:

jakirkham commented 5 years ago

Thanks for the follow-ups and work here, @thewtex. 🙂

Am curious if ITK is holding the GIL during these operations. Noticing that workers that are running the deconvolution are not updating any of their usage statistics. Though htop tells a different story.

Good observation! Most operations are multi-threaded, but they are not releasing the GIL, which may be important for Dask and more ... I have started a patch to address this.

Great! Thanks for working on this. Please let me know if there is something we can try out here or if you need another pair of eyes on the patch.

Right now I'm actually running into some issues earlier on. For instance I'm not able to pickle parts of ITK (see below). This is important for Dask to distribute tasks to multiple workers. Am attempting some workarounds (local imports), but run into other import issues that I've yet to reduce to MCVEs.

To provide some context on what is happening here -- the itk package and ITK modules contained within are LazyITKModule's. This means that the modules and their dependencies are loaded on demand. This means import itk is extremely quick, and when any attribute is used on itk only the dependent modules get loaded, which reduces load time.

Yep, this makes sense. Was wondering if something like this was going on under-the-hood.

A workaround to disable this behavior is:

import itkConfig
itkConfig.LazyLoading = False
# Now everything is loaded
import itk

That said, I would like to avoid the need for this with Dask if possible.

Agreed. I'm hoping we won't need this workaround.

LazyITKModule inherits from the vanilla Python types.ModuleType. I noticed that modules in general are not pickle-able. For example,

import sys
import pickle
pickle.dumps(sys)

fails in the same way as

import itk
import pickle
pickle.dumps(itk)

LazyITKModule is masking some of the methods of ModuleType, though, so I will see if exposing those again fixes these issues.

FWIW Dask uses cloudpickle under-the-hood. This will fallback to the builtin pickle if it doesn't work. So we are able to pickle things like sys.

```python
In [1]: import pickle
   ...: import cloudpickle

In [2]: import sys

In [3]: pickle.dumps(sys)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 pickle.dumps(sys)

TypeError: can't pickle module objects

In [4]: cloudpickle.dumps(sys)
Out[4]: b'\x80\x04\x953\x00\x00\x00\x00\x00\x00\x00\x8c\x17cloudpickle.cloudpickle\x94\x8c\tsubimport\x94\x93\x94\x8c\x03sys\x94\x85\x94R\x94.'

In [5]: cloudpickle.loads(cloudpickle.dumps(sys))
Out[5]: <module 'sys' (built-in)>
```

However we seem to run into issues when pickling itk.

```python
In [1]: import pickle
   ...: import cloudpickle

In [2]: import itk

In [3]: pickle.dumps(itk)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 pickle.dumps(itk)

TypeError: can't pickle LazyITKModule objects

In [4]: cloudpickle.dumps(itk)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 cloudpickle.dumps(itk)

~/miniconda/envs/img_exp/lib/python3.7/site-packages/cloudpickle/cloudpickle.py in dumps(obj, protocol)
   1106     try:
   1107         cp = CloudPickler(file, protocol=protocol)
-> 1108         cp.dump(obj)
   1109         return file.getvalue()
   1110     finally:

~/miniconda/envs/img_exp/lib/python3.7/site-packages/cloudpickle/cloudpickle.py in dump(self, obj)
    471         self.inject_addons()
    472         try:
--> 473             return Pickler.dump(self, obj)
    474         except RuntimeError as e:
    475             if 'recursion' in e.args[0]:

~/miniconda/envs/img_exp/lib/python3.7/pickle.py in dump(self, obj)
    435         if self.proto >= 4:
    436             self.framer.start_framing()
--> 437         self.save(obj)
    438         self.write(STOP)
    439         self.framer.end_framing()

~/miniconda/envs/img_exp/lib/python3.7/pickle.py in save(self, obj, save_persistent_id)
    522         reduce = getattr(obj, "__reduce_ex__", None)
    523         if reduce is not None:
--> 524             rv = reduce(self.proto)
    525         else:
    526             reduce = getattr(obj, "__reduce__", None)

TypeError: can't pickle LazyITKModule objects
```

jakirkham commented 5 years ago

@jakirkham btw, I reproduced your blog post with @rxist525 's data -- very well done. I tried a few things regarding performance

This is great! Thanks for doing this, @thewtex. 🙂

If you're interested, it might be worth sending a PR to update the blogpost with these additional suggestions about how to configure storage using Zstd and using other imread functions.
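
For illustration, configuring Zstd compression on a Zarr dataset looks roughly like this (path, shape, and chunking below are placeholders, not what the post uses):

import zarr
from numcodecs import Zstd

store = zarr.DirectoryStore("/tmp/example.zarr")
grp = zarr.open_group(store, mode="a")
grp.create_dataset(
    "deconvolved",
    shape=(3, 199, 201, 1024, 768),
    chunks=(1, 1, 201, 1024, 768),
    dtype="float32",
    compressor=Zstd(level=1),  # swap in Zstd for the default Blosc codec
)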

thewtex commented 5 years ago

Great! Thanks for working on this. Please let me know if there is something we can try out here or if you need another pair of eyes on the patch.

Thanks! A review on this patch would be appreciated: https://github.com/InsightSoftwareConsortium/ITK/pull/1065

thewtex commented 5 years ago

If you're interested, it might be worth sending a PR to update the blogpost with these additional suggestions about how to configure storage using Zstd and using other imread functions.

Certainly! Done in PR #40

thewtex commented 5 years ago

FWIW Dask uses cloudpickle under-the-hood. This will fallback to the builtin pickle if it doesn't work. So we are able to pickle things like sys.

Digging in here, it seems that cloudpickle does have extensions for pickling types.ModuleType. But, this route is not used for LazyITKModule. It appears that cloudpickle has instrumentation for some type of plugin system, although I am not aware of examples of how it is used -- pointers welcome ;-).

In the process of investigating this issue, I created this patch for PEP 366 compliance, but more will most likely be required.

jakirkham commented 5 years ago

I think if you wanted to use that, you would subclass CloudPickler and override that method. We could then use loads and dumps from your CloudPickler subclass. That's my guess anyways. Haven't used that functionality before. 🙂

Great! Thanks for the patch. Is there a good way for us to retry things with that patch? Are there nightlies for instance? Or should we try building ITK ourselves?

mrocklin commented 5 years ago

Checking in here. Are there current blockers that stop us from making progress here? If so, what are they?

thewtex commented 5 years ago

Well, the pickling and GIL issues have been fixed in ITK 5.0.1. :rabbit: :racehorse:

As a next step, we should inspect our input Zarr microscopy data so we understand the impact of our future processing. As a first step along that front, I will work on https://github.com/InsightSoftwareConsortium/itkwidgets/issues/44, and see if we can generate a blog post on that. It might be too slow, though -- the next step is to generate a multi-resolution pyramid.
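
One rough way to build such a pyramid with Dask is repeated mean-pooling over the spatial axes, e.g. with da.coarsen; a sketch, assuming the (channel, time, z, y, x) layout from earlier:

import numpy as np
import dask.array as da

def multiscale_pyramid(x, levels=3):
    # Downsample z, y, x by 2 at each level via mean-pooling
    out = [x]
    for _ in range(levels):
        x = da.coarsen(np.mean, x, {2: 2, 3: 2, 4: 2}, trim_excess=True)
        out.append(x)
    return out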

rxist525 commented 5 years ago

please let me know if I can be of any help.

mrocklin commented 5 years ago

Hey Gokul. Sorry for the delay. I'll take a look through John's notebook now and see what the status is.

mrocklin commented 5 years ago

Hrm, does ITK release the GIL? https://github.com/InsightSoftwareConsortium/ITK/issues/1134

My workers are locking up (presumably busy processing) but I'm unable to see what's going on while they do so.

Also, how long should I expect itk.RichardsonLucyDeconvolutionImageFilter to take?

mrocklin commented 5 years ago

Also, cc'ing @madsbk, who I noticed on a related Discourse topic

mrocklin commented 5 years ago

It seems like we're spending some time working around the itk Python library. I'm curious how hard it would be to wrap some of these functions directly with a tool like Cython or Numba.

Looking a bit at @jakirkham 's code here:

def itk_RichardsonLucyDeconvolutionImageFilter(img, psf):
    # The module does not pickle
    import itk

    # Drop first two singleton axes (need 3-D array at most)
    img = img[0, 0]
    psf = psf[0, 0]

    # Get ITK Image buffers from NumPy arrays (assumes 3-D float32 arrays)
    img = itk.PyBuffer[itk.Image.F3].GetImageFromArray(img)
    psf = itk.PyBuffer[itk.Image.F3].GetImageFromArray(psf)

    # Configure filter and get result reference
    op = itk.RichardsonLucyDeconvolutionImageFilter[itk.Image.F3, itk.Image.F3].New()
    op.SetInput(img)
    op.SetKernelImage(psf)
    r = op.GetOutput()

    # Compute result and view as NumPy array
    r.Update()
    r = itk.PyBuffer[itk.Image.F3].GetArrayViewFromImage(r)

    # Re-add singleton axes
    r = r[None, None]

    return r

It looks like the itk Python library is more or less a direct translation of a C/C++ codebase. Would it make sense to use something like Cython or Numba to construct a more Pythonic system here? Something that would, for example, consume and produce NumPy arrays and have an API more similar to Scikit-Image? To be clear, I'm not suggesting that this group completely re-implement the itk Python library, but for prototyping purposes I'm curious how hard it would be to wrap up the functions that we need. This would both help us to move forward, and also maybe serve as an example for future itk development. Just a thought though, this is probably naive in some way. I'm not sure that anyone has the time to do this if it's going to take more than a couple days of work.

mrocklin commented 5 years ago

It also looks like scikit-image has this operation? https://scikit-image.org/docs/dev/auto_examples/filters/plot_deconvolution.html

How does that compare? Could we use that to move past things here?
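
For comparison, the scikit-image routine works directly on NumPy arrays, so a single-volume call looks roughly like this (shapes and iteration count are illustrative; in the scikit-image releases of that era the parameter is called iterations):

import numpy as np
from skimage import restoration

vol = np.random.random((64, 128, 128)).astype(np.float32)  # stand-in for one 3-D chunk
psf = np.random.random((16, 32, 32)).astype(np.float32)
psf /= psf.sum()

deconvolved = restoration.richardson_lucy(vol, psf, iterations=10, clip=False)

Something like that could be dropped into map_blocks the same way as the ITK call, which would sidestep the pickling questions above.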

mrocklin commented 5 years ago

If we were able to wrap the C/C++ code with Numba that would also make it easy to use gufuncs, which would give us a Dask interface for free. I'm not sure that ITK has a C interface though and I'm not sure that Numba has any nice way to interface with C++ codes.

thewtex commented 5 years ago

Hrm, does ITK release the GIL?

It should with ITK 5.0.1, but I will verify.

Also, how long should I expect itk.RichardsonLucyDeconvolutionImageFilter to take?

On a chunk on my system, it takes less than 2 minutes.

It seems like we're spending some time working around the itk Python library. I'm curious how hard it would be to wrap some of these functions directly with a tool like Cython or Numba.

We made quite a bit of progress, and I believe we are close. We can avoid huge amounts of duplicated effort by leveraging the itk Python library!

Would it make sense to use something like Cython or Numba to construct a more Pythonic system here?

itk is in the process of exposing a more Pythonic API. Here is a more Pythonic version of the initial function:

import itk
import numpy as np

def richardson_lucy_deconvolution(img, psf, iterations=1):
    image = itk.image_view_from_array(np.ascontiguousarray(img))
    kernel = itk.image_view_from_array(np.ascontiguousarray(psf))

    deconvolved = itk.richardson_lucy_deconvolution_image_filter(image,
                                                                 kernel_image=kernel,
                                                                 number_of_iterations=iterations)
    return itk.array_from_image(deconvolved)

But, there is still room for improvement -- for example, we will support returning np.asarray(deconvolved) in the next release. Suggestions and contributions are welcome :-).
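
For illustration, calling it on small random volumes would look something like this (shapes and iteration count are placeholders):

import numpy as np

img = np.random.random((32, 64, 64)).astype(np.float32)
psf = np.random.random((8, 16, 16)).astype(np.float32)

result = richardson_lucy_deconvolution(img, psf, iterations=10)
print(result.shape, result.dtype)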

Here is a notebook to move us along:

https://gist.github.com/thewtex/29a0b7681a107d28c969fe6d1b477f89

By using the itkwidgets to inspect the point-spread function, I noticed that it could be trimmed to reduce computation:

[screenshot: the point-spread function viewed in itkwidgets]

After deconvolution, the histogram is smoother:

[screenshots: intensity histograms before and after deconvolution]

and the image is cleaner:

[screenshots: image slices before and after deconvolution]

I did not experiment with the number of iterations...

mrocklin commented 5 years ago

Hrm, does ITK release the GIL?

It should with ITK 5.0.1, but I will verify.

OK, I could be off somehow. I ran through things relatively quickly, so it's quite possible that I missed something.

It seems like we're spending some time working around the itk Python library. I'm curious how hard it would be to wrap some of these functions directly with a tool like Cython or Numba.

We made quite a bit of progress, and I believe we are close. We can avoid huge amounts of duplicated effort by leveraging the itk Python library!

To be clear, I'm not suggesting a competing effort. I'm just curious how hard it would be to wrap up the C++ functions with Cython/Numba in case we need to move faster for a short time. This might also be valuable experimentation for the mainline itk codebase.

Suggestions and contributions are welcome :-).

The API that you show looks really nice. From my perspective we would ideally be able to deal with NumPy arrays in and out, removing the need for image_view_from_array and array_from_image, but perhaps I'm missing context about extra metadata on images that might be important. From my perspective the requirement to always use ITK data structures makes it hard to compose ITK with other SciPy libraries.

I would also love to see GUFunc support. This provides a number of benefits when you want to apply the same function to many images in a loop. These will also work natively on Dask arrays, so there is no need to call functions like map_blocks. The easiest way I know to make a GUFunc today is with Numba. I do this in this blogpost about halfway down. See also https://github.com/numpy/numpy/issues/14020.

mrocklin commented 5 years ago

Relevant excerpt about GUFuncs from that blogpost

Generalized Universal Functions
-------------------------------

**Numba Docs:** https://numba.pydata.org/numba-doc/dev/user/vectorize.html
**NumPy Docs:** https://docs.scipy.org/doc/numpy-1.16.0/reference/c-api.generalized-ufuncs.html

A generalized universal function (gufunc) is a normal function that has been annotated with typing and dimension information. For example we can redefine our `smooth` function as a gufunc as follows:

```python
@numba.guvectorize(
    [(numba.int8[:, :], numba.int8[:, :])],
    '(n, m) -> (n, m)'
)
def smooth(x, out):
    out[:] = _smooth(x)
```

This function knows that it consumes a 2d array of int8's and produces a 2d array of int8's of the same dimensions. This sort of annotation is a small change, but it gives other systems like Dask enough information to use it intelligently. Rather than call functions like `map_blocks`, we can just use the function directly, as if our Dask Array was just a very large NumPy array.

```python
# Before gufuncs
y = x.map_blocks(smooth, dtype='int8')

# After gufuncs
y = smooth(x)
```

This is nice. If you write library code with gufunc semantics then that code just works with systems like Dask, without you having to build in explicit support for parallel computing. This makes the lives of users much easier.

mrocklin commented 5 years ago

I notice that ITK occasionally uses a bunch of threads at once. Is there an easy way to turn this off? For example, does it suffice to set OMP_NUM_THREADS=1 (my guess is not from looking at the docs)?
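
Perhaps ITK's own thread controls would work here; a guess on my part, not verified:

import os

# ITK has its own thread pool; this environment variable must be set
# before ITK starts doing work
os.environ["ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS"] = "1"

import itk
# Assumed wrapping of the C++ MultiThreaderBase static method; verify
# against the installed ITK version
itk.MultiThreaderBase.SetGlobalDefaultNumberOfThreads(1)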

mrocklin commented 5 years ago

I have decent confidence that these particular ITK functions are not releasing the GIL. This is for a few reasons:

  1. When I run my computation within two threads, I only see 100% CPU usage (except for occasional spikes when I think ITK is going multi-threaded itself)
  2. When I run my computation in two processes, I do see 200% CPU usage by default, so this isn't just an I/O bound thing
  3. I have difficulty breaking out of a script with Keyboard interrupts/Ctrl-C when it's running an ITK function
  4. My previous experience with workers locking up
mrocklin commented 5 years ago

Here is my current script, which combines @jakirkham's setup with @thewtex's ITK function. I decided to use Numba just to annotate the function with guvectorize so that I can use the same function with either NumPy or Dask arrays without having to fiddle about with dask.array.map_blocks.

import numpy as np
import numba
import itk
import zarr
import dask.array as da
import os
import dask.array.image

@numba.guvectorize(
    ["float32[:,:,:], float32[:,:,:], float32[:,:,:]"],
    "(i,j,k),(a,b,c)->(i,j,k)",
    forceobj=True,
)
def richardson_lucy_deconvolution(img, psf, out):
    iterations = 1
    image = itk.image_view_from_array(np.ascontiguousarray(img))
    kernel = itk.image_view_from_array(np.ascontiguousarray(psf))

    deconvolved = itk.richardson_lucy_deconvolution_image_filter(
        image, kernel_image=kernel, number_of_iterations=iterations
    )
    out[:] = itk.array_from_image(deconvolved)

if __name__ == "__main__":
    from dask.distributed import Client
    # Load image data
    z = zarr.open("/public/AOLLSMData_m4_raw.zarr/")
    imgs = da.from_zarr("/public/AOLLSMData_m4_raw.zarr/", "data")

    # Get psf
    dirpath = "/public/AOLLSMData/m4/psfs_z0p1/"
    psf = []
    for fn in sorted(os.listdir(dirpath)):
        fn = os.path.join(dirpath, fn)
        psf.append(dask.array.image.imread(fn))
    psf = da.stack(psf)

    imgs = imgs.astype(np.float32)
    psf = psf.astype(np.float32)

    out = richardson_lucy_deconvolution(imgs, psf)

    with Client(n_workers=2, threads_per_worker=1):  # two processes with one thread each
        out[0].sum().compute()  # do a bit of computation and then just return the sum for now
mrocklin commented 5 years ago

I ran into these two numba issues along the way:

mrocklin commented 5 years ago

To be clear here, I wouldn't recommend that we ask users to create gufunc signatures, but this is the sort of thing that developers could do in library code that would help make libraries work seamlessly for users who might want to use Dask in the future.

mrocklin commented 5 years ago

In another approach, which avoids using Numba, I get this:

  File "/home/nfs/mrocklin/miniconda/envs/image/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 62, in pickle_loads
    return pickle.loads(b"".join(frames))
  File "/home/nfs/mrocklin/miniconda/envs/image/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
ModuleNotFoundError: No module named '_itkVectorThresholdSegmentationLevelSetImageFilterPython'
mrocklin commented 5 years ago

If I hide the itk import within the function things seem to run. Here is a version that is less sophisticated, and is probably more appropriate to give to users:

https://nbviewer.jupyter.org/urls/gist.githubusercontent.com/mrocklin/2de0244ae306cab715fd20be7cec4322/raw/c15b1876ef88fe1e1151c85e6f0677a199ee6711/AOLLSM_ITK.ipynb
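
The core of that notebook is roughly the following shape: a function that imports itk locally, drops the singleton axes, runs the filter, and re-adds the axes, applied with map_blocks (a sketch, not the notebook verbatim):

import numpy as np
import dask.array as da

def deconvolve_chunk(img, psf, iterations=1):
    # Import inside the task so nothing ITK-related needs to be pickled
    import itk

    image = itk.image_view_from_array(np.ascontiguousarray(img[0, 0]))
    kernel = itk.image_view_from_array(np.ascontiguousarray(psf))
    deconvolved = itk.richardson_lucy_deconvolution_image_filter(
        image, kernel_image=kernel, number_of_iterations=iterations
    )
    # Restore the two leading singleton axes that map_blocks expects back
    return itk.array_from_image(deconvolved)[None, None]

# imgs: the (channel, time, z, y, x) Dask array; psf_np: a plain 3-D NumPy PSF.
# Non-dask arguments like psf_np are passed through to every chunk unchanged.
# out = imgs.astype(np.float32).map_blocks(deconvolve_chunk, psf_np, dtype=np.float32)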

I think that this process has been helpful in highlighting a few pain points. The solution in the notebook above seems like it is simple enough for downstream users, but with some warnings about some gotchas:

  1. itk imports still don't seem to play perfectly with serialization (though they're doable)
  2. Dask array's map_blocks is a bit finicky when you have extra dimensions lying around
  3. We need to convert back and forth from NumPy to ITK data structures. https://github.com/InsightSoftwareConsortium/ITK/issues/1136
  4. We need to make sure to use many single-threaded processes, rather than a few many-threaded processes https://github.com/InsightSoftwareConsortium/ITK/issues/1134
  5. Maybe limit multi-threading with ITK from Python (or environment variables) https://github.com/InsightSoftwareConsortium/ITK/issues/1137

However, these seem like issues that modestly skilled users can probably navigate in the short term, and that developers can probably address in the medium term. I'm in favor of both working on those issues and publishing this content in its current technical state.

What is the next operation? Segmentation? @rxist525 are you also using ITK for segmentation? If so, which operations?