rnewson / couchdb-lucene

Enables full-text searching of CouchDB documents using Lucene
Apache License 2.0

Benchmark on performance of Indexing and Searching #253

Open uditabose opened 7 years ago

uditabose commented 7 years ago

Hi, for my project I have to index about 700GB of text and HTML, which produces a Lucene index of about 19GB. The indexing process took about 30 hours last time. Do you have any benchmarking data? The things I need to understand are:

  1. Does CouchDB fetch performance deteriorate when indexing is added to it?
  2. What impact does couchdb-lucene have on indexing performance? Is performance similar to a native Java client?

rnewson commented 7 years ago

I don't have benchmarking data for Lucene itself beyond the general observation that it's pretty fast.

couchdb-lucene will add some cost, as it needs to evaluate JavaScript functions to produce the output.
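
To make the cost mentioned above concrete, here is a minimal sketch of the kind of JavaScript index function couchdb-lucene evaluates for each document, plus a rough way to time it. The `Document` class below is a stand-in stub written for this example so that it runs under plain Node.js; in couchdb-lucene the class is supplied by the indexer. The names `indexFunction`, `timeIndexFunction`, and `sampleDocs`, and the `title` and `body` document fields, are assumptions for illustration. Timings from Node.js will not match those inside couchdb-lucene, so treat this only as a way to compare one index function against another.

```javascript
// Stand-in for the Document class that couchdb-lucene exposes to index
// functions; this stub exists only so the sketch runs under plain Node.js.
class Document {
  constructor() {
    this.fields = [];
  }
  // Record a value to index, with optional per-field settings.
  add(value, options = {}) {
    this.fields.push({ value, options });
  }
}

// Hypothetical index function: called once per CouchDB document, it returns
// a Document to index, or null to skip the document.
function indexFunction(doc) {
  if (!doc.body) {
    return null;
  }
  const result = new Document();
  result.add(doc.title, { field: "title", store: "yes" });
  result.add(doc.body);
  return result;
}

// Time the function over a batch of documents to estimate per-document cost.
function timeIndexFunction(docs) {
  const start = process.hrtime.bigint();
  let indexed = 0;
  for (const doc of docs) {
    if (indexFunction(doc) !== null) {
      indexed += 1;
    }
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { indexed, elapsedMs };
}

// Synthetic sample data (assumed shape, not taken from the thread).
const sampleDocs = [
  { title: "a", body: "some text" },
  { title: "b" }, // no body, so the index function skips it
  { title: "c", body: "<p>html</p>" },
];
const timing = timeIndexFunction(sampleDocs);
console.log(timing.indexed, "documents indexed in", timing.elapsedMs, "ms");
```

Returning `null` early for documents that do not need indexing keeps the per-document work small, which matters when the function runs over hundreds of gigabytes of input.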