mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0

Add chart showing total merge time by part of index #217

Open mikemccand opened 1 year ago

mikemccand commented 1 year ago

Lucene's infoStream now logs how long each part of the index (postings, doc values, points, vectors, etc.) takes to merge. It should be simple to parse these logs from nightly benchmarks and chart these merge costs over time?
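
A minimal sketch of the parsing step, in Python like the rest of the scripts here. It assumes the merge timing messages look roughly like `12 msec to merge doc values [1000 docs]`; that wording is an assumption and may differ between Lucene versions, so the pattern would need adjusting against a real log. The names `MERGE_TIME_RE` and `total_merge_msec_by_part` are hypothetical, not existing luceneutil code.

```python
import re
from collections import defaultdict

# ASSUMPTION: timing messages look like "<N> msec to merge <part> [<M> docs]".
# Check this against an actual infoStream log before relying on it.
MERGE_TIME_RE = re.compile(r"(\d+) (?:msec|ms) to merge (.+?) \[(\d+) docs\]")


def total_merge_msec_by_part(lines):
    """Sum the reported merge time (in msec) for each part of the index."""
    totals = defaultdict(int)
    for line in lines:
        m = MERGE_TIME_RE.search(line)
        if m:
            # group(1) is the elapsed time, group(2) the index part name
            totals[m.group(2)] += int(m.group(1))
    return dict(totals)


# Hypothetical sample lines, only to show the intended input shape
sample_lines = [
    "SM 0 [2024-01-01; Thread-1]: 12 msec to merge doc values [1000 docs]",
    "IW 0 [2024-01-01; Thread-1]: some unrelated message",
    "SM 0 [2024-01-01; Thread-1]: 30 msec to merge postings [1000 docs]",
    "SM 0 [2024-01-01; Thread-2]: 8 msec to merge doc values [500 docs]",
]
totals = total_merge_msec_by_part(sample_lines)
```

Running this once per nightly log would give one total per index part per night, which is the series a chart over time needs.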

mikemccand commented 1 year ago

Hmm, except, we do not enable infoStream logging in the nightly benchmarks! We could enable it for the single-threaded index, since we don't otherwise count its metrics like docs/sec.

rmuir commented 1 year ago

Yes, this would be nice when discussing issues such as https://github.com/apache/lucene/issues/12203. Otherwise, I think merges are currently too opaque when discussing index performance, but we "know" certain parts are way slower than others.

Honestly, as an even simpler start we could simply make the infostream.txt available somehow (if it isn't an insane amount of disk space). Then at least we could investigate the current breakdown.