This issue is directly related to #267
A workaround was to lower the `pixel_threshold`. Worked for me.
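For reference, a minimal sketch of where the threshold is set, assuming hipscat-import's `ImportArguments` API (the paths, reader, and values below are placeholders, and parameter names can differ between versions):

```python
from hipscat_import.catalog.arguments import ImportArguments
from hipscat_import.pipeline import pipeline

# Placeholder arguments; pixel_threshold is the relevant knob here.
# Lowering it caps the number of rows per HEALPix pixel, so the
# reduce step holds smaller chunks in memory at a time.
args = ImportArguments(
    output_artifact_name="splus_dr4_dual",  # placeholder name
    input_path="/path/to/input/",           # placeholder path
    file_reader="csv",                      # depends on your data format
    output_path="/path/to/output/",         # placeholder path
    pixel_threshold=250_000,                # lowered from the default
)
pipeline(args)
```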
That's interesting! We have a notebook to estimate what your `pixel_threshold` should be for your data. You could check whether the notebook's result matches your new value: https://hipscat-import.readthedocs.io/en/stable/notebooks/estimate_pixel_threshold.html
I am closing this as it seems to be solved for now. Unmanaged memory issues continue to plague us...
I'm generating HiPS over all of the S-PLUS DR4 dual photometry. The dataset is 160 GB, composed of 1,412 files.
We use Ubuntu 22.04 with 40 GB of RAM and 24 CPU cores.
If I run on a small fraction of the dataset, everything goes fine, but with the whole dataset I'm experiencing memory issues that lead to errors.
I set `Client(memory_limit="20GB")` just to be sure. In the `reducing` step, the warning below is raised multiple times after ~15% progress. I watched `htop` while running it; memory increases until it hits the machine's maximum, and then it also fills the swap. After this, it starts to give errors like:

Investigating the dask docs at https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os, it seems that a possible solution on Linux is to manually free memory with:
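The snippet from that page is roughly the following (glibc's `malloc_trim` called through `ctypes`; Linux/glibc only):

```python
import ctypes

def trim_memory() -> int:
    # Ask glibc to return freed arena memory back to the OS.
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)
```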
The problem is that this seems to be an implementation that frees memory only within the client instance, in the main thread.
Any idea on how to move on from this?
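For what it's worth, that same docs page shows running the trim on each worker rather than only on the client, via `Client.run` — a minimal sketch, repeating the `trim_memory` helper above so it is self-contained:

```python
import ctypes
from dask.distributed import Client

def trim_memory() -> int:
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)

client = Client(memory_limit="20GB")  # same per-worker limit as above

# Client.run executes the function once on every worker process,
# so the trim is not limited to the client's main thread.
client.run(trim_memory)
```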