lsst-epo / citizen-science-notebooks

A collection of Jupyter notebooks that can be used to associate Rubin Science Platform data with a Zooniverse citizen science project.

Test out the concurrent processing patch in `display_matplotlib` in the RSP Notebook Aspect #98

Closed. ericdrosas87 closed this issue 3 months ago

ericdrosas87 commented 4 months ago

User story

As the developer of the citSci pipeline, I need to test out the concurrent processing patch to display_matplotlib to see if my Butler concurrent processing work is now unblocked.

Definition of done

I have verified whether the patched display_matplotlib enables concurrent processing or still blocks my work.

ericdrosas87 commented 3 months ago

As it stands, the display_matplotlib package is now thread-safe, but it leaks memory. Apparently, this issue has been known since at least 2018. Thankfully, there is a DM ticket marked as in progress, but there is no ETA on delivery, so I don't think we can rely on it landing any time soon. The best we can do at the moment is push for that DM ticket's priority to be raised.

Full discussion here.

I also tested ProcessPoolExecutor in place of ThreadPoolExecutor. ProcessPoolExecutor retrieves 100 images in half the time that ThreadPoolExecutor takes, but it also uses more memory, which causes it to fail much earlier than ThreadPoolExecutor at higher source counts.
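
For anyone who wants to repeat the comparison, here is a minimal sketch of how the two executors can be swapped behind one helper. It is not the notebook's actual code: `fetch_cutout` and `fetch_all` are hypothetical names, and the `time.sleep` call only stands in for the real Butler retrieval and display_matplotlib rendering step.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fetch_cutout(source_id):
    """Hypothetical stand-in for one Butler retrieval plus image render."""
    time.sleep(0.01)  # placeholder for the real I/O and plotting work
    return source_id, f"cutout_{source_id}.png"


def fetch_all(executor_cls, source_ids, max_workers=4):
    """Map fetch_cutout over source_ids with the given executor class.

    Returns the ordered results and the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map preserves input order for both executor types
        results = list(pool.map(fetch_cutout, source_ids))
    return results, time.perf_counter() - start
```

Calling `fetch_all(ThreadPoolExecutor, range(100))` and then `fetch_all(ProcessPoolExecutor, range(100))` gives the two timings to compare. The process-based call needs `fetch_cutout` to be defined at module level so it can be pickled, and on platforms that spawn worker processes it should sit under an `if __name__ == "__main__":` guard. Each worker process holds its own copy of whatever it loads, which is consistent with the higher memory use noted above.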