jimmejardine / qiqqa-open-source

The open-sourced version of the award-winning Qiqqa research management tool for Windows
GNU General Public License v3.0

Same as #17 (unresponsive Qiqqa) but now for large Watch directory sets or RESET Watch Directories #20

Closed: GerHobbelt closed this issue 5 years ago

GerHobbelt commented 5 years ago

Same type of behaviour as #17 (Qiqqa becomes unresponsive), this time triggered either by dropping a ton of PDFs in the Watch Folder or by resetting / changing the Watch Folder so that it points to a large set of PDFs.

For a 20K+ library, this can take ages and keeps going in the background "almost indefinitely", even after the user has closed Qiqqa. This is undesirable behaviour.

The Qiqqa.log logfile gets littered with reams of this stuff:

20190802.180827 INFO [11] FolderWatcher file_system_watcher_Created
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Changed
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Changed
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Created
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Changed
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Changed
20190802.180827 INFO [Main] Waiting for Maintainable Qiqqa.Common.GeneralTaskDaemonStuff.GeneralTaskDaemon:DoMaintenance to terminate.
20190802.180827 INFO [11] FolderWatcher file_system_watcher_Created
20190802.180827 INFO [21] FolderWatcher file_system_watcher_Changed
20190802.180827 INFO [21] FolderWatcher file_system_watcher_Changed
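The `[Main] Waiting for Maintainable ... to terminate` entry suggests that the main thread is waiting for a background task to finish while watcher events keep arriving. One common way to keep such a wait short is to let the background worker check a stop flag between small batches. The following is a minimal Python sketch of that general pattern only; it is not Qiqqa's actual C# code, and `BackgroundImporter`, `request_stop`, the batch size, and the pause length are all hypothetical.

```python
import threading
from collections import deque


class BackgroundImporter:
    """Hypothetical worker that imports queued paths in small batches."""

    def __init__(self, paths, batch_size=2):
        self._queue = deque(paths)
        self._batch_size = max(1, batch_size)  # never allow an empty batch
        self._stop = threading.Event()
        self.imported = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    @property
    def remaining(self):
        """Number of paths still waiting in the queue."""
        return len(self._queue)

    def start(self):
        self._thread.start()

    def request_stop(self, timeout=5.0):
        """Ask the worker to stop; return True if it ended within timeout."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()

    def _run(self):
        # The stop flag is checked once per batch, so a shutdown request
        # waits for at most one small batch instead of the whole queue.
        while self._queue and not self._stop.is_set():
            for _ in range(min(self._batch_size, len(self._queue))):
                self.imported.append(self._queue.popleft())
            # Short pause standing in for real work; wakes early on stop.
            self._stop.wait(0.01)


if __name__ == "__main__":
    importer = BackgroundImporter([f"paper{i}.pdf" for i in range(1000)])
    importer.start()
    stopped = importer.request_stop()
    print(stopped, len(importer.imported), importer.remaining)
```

The point of the design is that the cost of closing the application is bounded by one batch, not by the size of the Watch Folder.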

GerHobbelt commented 5 years ago

Done as per #33.

Commits:

Revision: dc740d77b3893262fac573523309a617a9c99389 fix/tweak FolderWatcher background task: make sure we AT LEAST process ONE(1) tiny batch of PDF files when there are any to process.

Revision: 0b7d3b4674082f610ac12a074f2706579fc8ae49 fix/tweak: do NOT report 'Adds 0 of 0 document(s)' but clear the status part instead: now that we make Qiqqa work in small batches, this sort of thing MAY happen. (TODO: review WHY the Length of the todo array is actually ZERO, but low priority as things work and don't b0rk)

Revision: da3f8531f0e0baf14a45c46db199b4160b6cb3bf Corrected Folder Watch loop + checks for https://github.com/jimmejardine/qiqqa-open-source/issues/20: the intent here is very similar to the code done previously for https://github.com/jimmejardine/qiqqa-open-source/issues/17; we just want to add a tiny batch of PDF files from the Watch folder, irrespective of the number of files waiting there to be added.

Revision: 8a1d7660659079939e59be74bf3822ea6311a205 Fix https://github.com/jimmejardine/qiqqa-open-source/issues/17 by processing PDFs in any Qiqqa library in small batches so that Qiqqa is not unresponsive for a loooooooooooooong time when it is re-indexing/upgrading/whatever a large library, e.g. 20K+ PDF files. The key here is to make the 'infrequent background task' produce some result quickly (like a working, yet incomplete, Lucene search index DB!) and then update/augment that result as time goes by. This way, we can recover a search index for larger Qiqqa libraries!
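The batching idea described in these commits can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual Qiqqa C# code: `BATCH_SIZE`, `take_batch`, `status_text`, and `run_pass` are hypothetical names, the batch size of 5 is arbitrary (the commits only say "tiny batch"), and the status wording merely mirrors the message quoted above.

```python
from typing import Callable, List, Tuple

BATCH_SIZE = 5  # assumed value; the commits only say "tiny batch"


def take_batch(pending: List[str], batch_size: int = BATCH_SIZE) -> Tuple[List[str], List[str]]:
    """Split pending files into (batch, remainder).

    The batch holds at least one file whenever any file is pending,
    even if batch_size was configured as zero or negative.
    """
    size = max(1, batch_size)
    return pending[:size], pending[size:]


def status_text(added: int, total: int) -> str:
    """Build the status message; clear it when there was nothing to do."""
    if total == 0:
        return ""  # avoids reporting 'Adds 0 of 0 document(s)'
    return f"Adds {added} of {total} document(s)"


def run_pass(pending: List[str],
             add_document: Callable[[str], None],
             batch_size: int = BATCH_SIZE) -> Tuple[List[str], str]:
    """Run one background pass: add one small batch, leave the rest queued."""
    batch, remainder = take_batch(pending, batch_size)
    for path in batch:
        add_document(path)
    return remainder, status_text(len(batch), len(pending))


if __name__ == "__main__":
    pending = [f"paper{i}.pdf" for i in range(12)]
    library: List[str] = []
    while pending:
        # Each pass returns quickly; the library is usable between passes.
        pending, status = run_pass(pending, library.append)
        print(status)
```

Each pass does a bounded amount of work regardless of how many files are waiting, so a partial but usable result exists after the first pass and grows with every later one.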

GerHobbelt commented 5 years ago

Closing this to declutter the issue list so it stays workable for me: fixed in the mainline (master) branch of https://github.com/GerHobbelt/qiqqa-open-source, pending #15 / any maintainer rights/actions.