Closed PDDeane closed 2 years ago
Hi @PDDeane, which literature example are you running, which version of Holmes, and on which operating system?
example_search_EN_literature.py, holmes_extractor 4.0.3, Ubuntu 20.04 LTS
A couple of things you could try:

- Set `number_of_workers=1` at line 24. This means only one worker thread will be created. However, on Ubuntu worker threads are forked rather than spawned, so while this should reduce the CPU load it won't have much impact on memory use.
- Define a swap file (https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-20-04/).

Hello, I encountered the same problem today on Ubuntu 22.04.1 LTS, both with the mentioned example and with a smaller custom example.
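For reference, creating and enabling a swap file on Ubuntu generally follows the steps in the linked guide. This is a sketch; the 16G size is an assumption and should be sized to your workload:

```shell
# Allocate a swap file (size is an assumption; the Harry Potter corpus
# reportedly needed roughly 10 GB of swap on top of 32 GB RAM).
sudo fallocate -l 16G /swapfile
# Restrict permissions so only root can read/write it.
sudo chmod 600 /swapfile
# Format the file as swap space and enable it.
sudo mkswap /swapfile
sudo swapon /swapfile
```

To keep the swap file active across reboots, add a line for `/swapfile` to `/etc/fstab` as described in the guide.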
Now that I have activated a large enough swap partition, I can confirm that the advice above worked for me. :+1:
(On my system, at peak memory usage 32 GB RAM and about 9-10 GB Swap were used for indexing the Harry Potter corpus. When ready for search total memory usage dropped to a bit more than 32 GB.)
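To check that swap is actually active and to watch memory while indexing runs, the standard procps/util-linux tools are enough (nothing Holmes-specific):

```shell
# Show total/used/free RAM and swap in human-readable units.
free -h
# List active swap areas; empty output means no swap is enabled.
swapon --show
```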
When I try to run the Holmes extractor using this example, it finishes parsing, spawns a large number of worker threads during indexing, and then the process is killed, presumably because it exceeded system resources.
What's going on, and how can I fix it?
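As a sketch of the worker-count suggestion above, limiting Holmes to a single worker would look roughly like this (the spaCy model name is an assumption; use whatever the example actually loads at line 24):

```python
import holmes_extractor as holmes

# number_of_workers=1 creates only one worker, reducing CPU load
# (though, since Ubuntu forks workers, memory savings are limited).
# The model name "en_core_web_lg" is an assumption for illustration.
manager = holmes.Manager(model="en_core_web_lg", number_of_workers=1)
```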