privacore / open-source-search-engine

No longer maintained. Please read our shutdown message.
https://privacore.github.io/

Many Log Messages: INF merge: Considering merging 0 #57

Closed Outstep closed 6 years ago

Outstep commented 6 years ago

Hello,

I am getting some interesting messages in my logs and am wondering what they mean and what is the best way to resolve them:


20180602-112522-217 0000 000008 INF merge: Using min files to merge 6 for posdb
20180602-112522-217 0000 000008 INF merge: Considering merging 0 posdb files on disk. 6 files needed to trigger a merge.
20180602-112522-218 0000 000008 INF merge: Using min files to merge 6 for titledb
20180602-112522-218 0000 000008 INF merge: Considering merging 0 titledb files on disk. 6 files needed to trigger a merge.
20180602-112522-218 0000 000008 INF merge: Using min files to merge 2 for tagdb
20180602-112522-218 0000 000008 INF merge: Considering merging 0 tagdb files on disk. 2 files needed to trigger a merge.
20180602-112522-218 0000 000008 INF merge: Using min files to merge 6 for linkdb
20180602-112522-218 0000 000008 INF merge: Considering merging 0 linkdb files on disk. 6 files needed to trigger a merge.
20180602-112522-218 0000 000008 INF merge: Using min files to merge 2 for spiderdb
20180602-112522-218 0000 000008 INF merge: Considering merging 0 spiderdb files on disk. 2 files needed to trigger a merge.
20180602-112522-218 0000 000008 INF merge: Using already-set m_minToMergeDefault of 2 for clusterdb
20180602-112522-218 0000 000008 INF merge: Considering merging 0 clusterdb files on disk. 2 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using min files to merge 6 for posdb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 posdb files on disk. 6 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using min files to merge 6 for titledb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 titledb files on disk. 6 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using min files to merge 2 for tagdb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 tagdb files on disk. 2 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using min files to merge 6 for linkdb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 linkdb files on disk. 6 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using min files to merge 2 for spiderdb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 spiderdb files on disk. 2 files needed to trigger a merge.
20180602-112622-224 0000 000008 INF merge: Using already-set m_minToMergeDefault of 2 for clusterdb
20180602-112622-224 0000 000008 INF merge: Considering merging 0 clusterdb files on disk. 2 files needed to trigger a merge.
20180602-112722-232 0000 000008 INF merge: Using min files to merge 6 for posdb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 posdb files on disk. 6 files needed to trigger a merge.
20180602-112722-233 0000 000008 INF merge: Using min files to merge 6 for titledb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 titledb files on disk. 6 files needed to trigger a merge.
20180602-112722-233 0000 000008 INF merge: Using min files to merge 2 for tagdb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 tagdb files on disk. 2 files needed to trigger a merge.
20180602-112722-233 0000 000008 INF merge: Using min files to merge 6 for linkdb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 linkdb files on disk. 6 files needed to trigger a merge.
20180602-112722-233 0000 000008 INF merge: Using min files to merge 2 for spiderdb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 spiderdb files on disk. 2 files needed to trigger a merge.
20180602-112722-233 0000 000008 INF merge: Using already-set m_minToMergeDefault of 2 for clusterdb
20180602-112722-233 0000 000008 INF merge: Considering merging 0 clusterdb files on disk. 2 files needed to trigger a merge.
20180602-112822-233 0000 000008 INF merge: Using min files to merge 6 for posdb
20180602-112822-233 0000 000008 INF merge: Considering merging 0 posdb files on disk. 6 files needed to trigger a merge.
20180602-112822-234 0000 000008 INF merge: Using min files to merge 6 for titledb
20180602-112822-234 0000 000008 INF merge: Considering merging 0 titledb files on disk. 6 files needed to trigger a merge.
20180602-112822-234 0000 000008 INF merge: Using min files to merge 2 for tagdb
20180602-112822-234 0000 000008 INF merge: Considering merging 0 tagdb files on disk. 2 files needed to trigger a merge.
20180602-112822-234 0000 000008 INF merge: Using min files to merge 6 for linkdb
20180602-112822-234 0000 000008 INF merge: Considering merging 0 linkdb files on disk. 6 files needed to trigger a merge.
20180602-112822-234 0000 000008 INF merge: Using min files to merge 2 for spiderdb
20180602-112822-234 0000 000008 INF merge: Considering merging 0 spiderdb files on disk. 2 files needed to trigger a merge.
20180602-112822-234 0000 000008 INF merge: Using already-set m_minToMergeDefault of 2 for clusterdb
20180602-112822-234 0000 000008 INF merge: Considering merging 0 clusterdb files on disk. 2 files needed to trigger a merge.


Can you please tell me more about this?

Cheers,

br-privacore commented 6 years ago

Once the engine has switched to a new data file, it never writes to the previous one again. So if a page was spidered and stored in file 1, then respidered and stored again in file 3, there are two copies on disk, and only the newest one is used. Periodically, it picks two files to merge in order to get rid of the obsolete records. The INF lines above say that it checked whether a periodic merge should be done, but it didn't find enough files to trigger one.
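
To make the idea concrete, here is a minimal Python sketch of the two steps described above: the threshold check and the newest-wins merge. It is not the engine's actual code. The names `should_merge` and `merge_files` are hypothetical, data files are modelled as plain dicts, and the `>=` comparison is an assumption based on the "6 files needed to trigger a merge" wording in the log.

```python
def should_merge(files_on_disk: int, min_to_merge: int) -> bool:
    """Assumed rule: merge once at least min_to_merge files exist."""
    return files_on_disk >= min_to_merge


def merge_files(files: list[dict]) -> dict:
    """Merge data files given oldest-first; the newest record per key wins.

    Each file is modelled as {key: record}. This is a simplification
    for illustration, not the engine's on-disk format.
    """
    merged = {}
    for data_file in files:        # oldest to newest
        merged.update(data_file)   # later files overwrite obsolete records
    return merged


# A page stored in file 1 and respidered into file 3.
file1 = {"example.com/page": "version spidered first"}
file2 = {"example.com/other": "some other page"}
file3 = {"example.com/page": "version respidered later"}

print(should_merge(0, 6))                  # the situation in the log above
print(merge_files([file1, file2, file3]))  # only the newest copy survives
```

With 0 files on disk and a minimum of 6, the check fails, which is what each "Considering merging 0 ..." line reports.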

It is suspicious that it finds 0 files, though, unless the instance logging this is empty.
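
To see quickly whether every database really reports 0 files, the "Considering merging" lines can be tallied with a short script. This is only a sketch: the regular expression is derived from the log lines quoted in this issue, and the helper name `summarize_merge_checks` is made up for the example. It is not part of the engine.

```python
import re

# Pattern derived from the log lines quoted above (an assumption about
# the format; adjust it if your log lines differ).
CONSIDER_RE = re.compile(
    r"INF merge: Considering merging (\d+) (\w+) files on disk\. "
    r"(\d+) files needed to trigger a merge\."
)


def summarize_merge_checks(log_text: str) -> dict:
    """Return {db_name: (files_on_disk, files_needed)} from the latest check."""
    summary = {}
    for match in CONSIDER_RE.finditer(log_text):
        on_disk, db_name, needed = match.groups()
        summary[db_name] = (int(on_disk), int(needed))  # later lines overwrite
    return summary


sample = (
    "20180602-112522-217 0000 000008 INF merge: Considering merging 0 posdb "
    "files on disk. 6 files needed to trigger a merge.\n"
    "20180602-112522-218 0000 000008 INF merge: Considering merging 0 tagdb "
    "files on disk. 2 files needed to trigger a merge.\n"
)

for db_name, (on_disk, needed) in summarize_merge_checks(sample).items():
    print(f"{db_name}: {on_disk} on disk, {needed} needed")
```

If every database shows 0 files on an instance that has already spidered pages, that would be worth looking into.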