Open ctuguinay opened 4 months ago
This is interesting. I was just running a notebook on my local machine and did not observe the accumulation during `open_raw`, but did observe it during `combine_echodata`. What's really interesting is that the accumulated unmanaged memory was later reclaimed while downstream tasks were running, so the notebook still ran fine through to the end. I also wonder if the Dask version matters.
An issue to track the accumulation of memory when `ep.open_raw` is used with a Dask Cluster. This was seen during the Echodataflow Open Raw -> Sv Flow in the earlier stages of the Shimada ship-to-cloud pipeline. The two ways to resolve this are restarting/closing the cluster or not using a Dask Cluster at all.
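As a rough illustration of the restart workaround, here is a minimal sketch that restarts the workers after each batch of files. It is not what Echodataflow does: the helper name `convert_in_batches`, the `convert` callable, and the batch size are all hypothetical, and it assumes `client` behaves like a `dask.distributed.Client` (`map`, `gather`, `restart`). Restarting discards everything held on the workers, so results are gathered first.

```python
from typing import Any, Callable, List, Sequence


def convert_in_batches(
    client: Any,
    raw_paths: Sequence[str],
    convert: Callable[[str], Any],
    batch_size: int = 10,
) -> List[Any]:
    """Run `convert` on each path, restarting the Dask workers after every batch.

    `client` is assumed to expose the dask.distributed.Client methods
    `map`, `gather`, and `restart`. `convert` is a hypothetical user function
    (for example, one that calls ep.open_raw and writes the result to disk).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    results: List[Any] = []
    for start in range(0, len(raw_paths), batch_size):
        batch = raw_paths[start:start + batch_size]
        futures = client.map(convert, batch)
        # Pull results back to the local process before the restart,
        # because restart() drops all data and futures held by the workers.
        results.extend(client.gather(futures))
        # Restarting the worker processes releases memory they accumulated.
        client.restart()
    return results
```

With a real cluster this might be called as `convert_in_batches(Client(), paths, my_convert)`, where `my_convert` ideally returns something small (such as an output path) rather than a full dataset, since gathered results live in the calling process.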
Code to replicate the issue: