Closed PDDeane closed 2 years ago
Hi @PDDeane, which literature example are you running, which version of Holmes, and on which operating system?
example_search_EN_literature.py, holmes_extractor 4.0.3, Ubuntu 20.04 LTS
A couple of things you could try:

- Set `number_of_workers=1` at line 24. This means only one worker thread will be created. However, on Ubuntu worker threads are forked rather than spawned, so while this should reduce the CPU load it won't have much impact on memory use.
- Define a swap file (https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-20-04/).

Hello, I encountered the same problem today on Ubuntu 22.04.1 LTS, both with the mentioned example and with a smaller custom example.
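For reference, creating and enabling a swap file on Ubuntu generally follows the steps in the linked guide. This is a sketch; the 16G size is an assumption and should be sized to your workload:

```shell
# Allocate a swap file (size is an assumption; the Harry Potter corpus
# reportedly needed roughly 10 GB of swap on top of 32 GB RAM).
sudo fallocate -l 16G /swapfile
# Restrict permissions so only root can read/write it.
sudo chmod 600 /swapfile
# Format the file as swap space and enable it.
sudo mkswap /swapfile
sudo swapon /swapfile
```

To keep the swap file active across reboots, add a line for `/swapfile` to `/etc/fstab` as described in the guide.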
Now that I have activated a large enough swap partition, I can confirm that the advice above worked for me. :+1:
(On my system, at peak memory usage 32 GB RAM and about 9-10 GB Swap were used for indexing the Harry Potter corpus. When ready for search total memory usage dropped to a bit more than 32 GB.)
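To check that swap is actually active and to watch memory while indexing runs, the standard procps/util-linux tools are enough (nothing Holmes-specific):

```shell
# Show total/used/free RAM and swap in human-readable units.
free -h
# List active swap areas; empty output means no swap is enabled.
swapon --show
```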
When I try to run the Holmes extractor using this example, it finishes parsing, spawns a large number of worker threads during indexing, and then the process is killed, presumably because it exceeded system resources.
What's going on, and how can I fix it?
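As a sketch of the worker-count suggestion above, limiting Holmes to a single worker would look roughly like this (the spaCy model name is an assumption; use whatever the example actually loads at line 24):

```python
import holmes_extractor as holmes

# number_of_workers=1 creates only one worker, reducing CPU load
# (though, since Ubuntu forks workers, memory savings are limited).
# The model name "en_core_web_lg" is an assumption for illustration.
manager = holmes.Manager(model="en_core_web_lg", number_of_workers=1)
```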