Closed: radry closed this issue 1 month ago
Same here on Ubuntu, even with 64GB of RAM.
Same here on Windows 10, 32GB RAM, difPy 4.0.1:
Process SpawnPoolWorker-21:
Traceback (most recent call last):
File "C:\Python311\Lib\multiprocessing\pool.py", line 131, in worker
put((job, i, result))
File "C:\Python311\Lib\multiprocessing\queues.py", line 371, in put
obj = _ForkingPickler.dumps(obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\multiprocessing\reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
MemoryError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python311\Lib\multiprocessing\process.py", line 314, in _bootstrap
self.run()
File "C:\Python311\Lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Python311\Lib\multiprocessing\pool.py", line 134, in worker
util.debug("Possible encoding error while sending result: %s" % (
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
MemoryError
It's still running at 94-98% memory usage. Is it going to finish, or should I just stop stressing my RAM? @breengles @radry
@KalyaSc I was not able to get it running on a large image collection, so I have switched to other tools for now. Hopefully a solution will be found soon.
Hi @radry
Thanks a lot for flagging this issue. This is indeed not intended behaviour, and a fix will be implemented in the upcoming difPy release. Stay tuned!
Thanks again and best, Elise
Hi all,
difPy v4.1.0 now comes with improved handling of larger datasets; see the guide.
Additionally, the new version lets you adjust the number of processes used for multiprocessing, so to reduce memory overhead you can now manually lower this value. Previously, difPy was set to always use os.cpu_count(). For more details, I recommend checking the updated documentation of this feature. Keep in mind, however, that lowering the number of simultaneous processes also reduces performance, so computation times will be longer.
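For illustration, a minimal sketch of capping the worker count (the processes keyword and the directory path are assumptions based on the v4.1.0 documentation, not taken verbatim from this thread):

import os
import difPy

# Build the image representations with fewer worker processes than the
# default (difPy previously always used os.cpu_count()).
# os.cpu_count() can return None, hence the fallback.
dif = difPy.build("C:/images", processes=max(1, (os.cpu_count() or 2) // 2))

# Run the search for similar images on the built object.
search = difPy.search(dif, similarity="similar")
print(search.result)

Fewer simultaneous processes means fewer image chunks held in memory at once, which is why this trades speed for a lower peak memory footprint.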
Let me know if this helps or if you're still encountering issues.
Best, Elise
Apparently there is no built-in memory limit, and difPy will eat as much memory as it can get from Windows. "Preparing Files" completes fine, but when searching for differences it consumes a lot of memory. My Windows is set up to automatically manage the pagefile and will happily enlarge it until the drive it is located on is full. When that happens, the following error appears:
The program will continue to run but then stops without any further error after a few minutes. No log file is produced.
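To observe this behaviour while difPy runs, a simple memory monitor can be left running in a second terminal. This is a hypothetical helper using the third-party psutil package, not part of difPy:

import time
import psutil  # third-party: pip install psutil

# Print overall RAM and swap/pagefile usage every 10 seconds while
# difPy runs in another terminal.
while True:
    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()
    print(f"RAM {mem.percent}% used, swap/pagefile {swap.percent}% used")
    time.sleep(10)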
How to reproduce: run difPy from the command line (similarity s=90) on a directory with ~60,000 images, on a machine with 16GB RAM and limited hard drive space for the pagefile. Let Windows manage the pagefile size. Wait for "Preparing Files" to complete (this takes an hour or so). A sketch of the command is shown below.
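For reference, a hypothetical form of that command (the dif.py entry point and the -D/-s flags are assumptions based on the difPy CLI documentation; the path is a placeholder):

# Run difPy on a folder with a custom similarity threshold of 90.
python dif.py -D "C:/images" -s 90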
I don't know what would happen if the pagefile had a fixed size; I assume the same error would appear.
System: Windows 10, 16GB RAM, difPy 4.0.1