Do we have the tools for benchmark on index build?

tang-hi commented 2 years ago

The guide in the README looks like it only covers search benchmarks. I didn't find a benchmark tool for index builds. If we have such a tool, could you tell me which file it is in? 😄

mikemccand commented 2 years ago

You are correct that the benchmark is search-focused.

But it does also report index size and indexing throughput / total time, and gives you control over whether to include the commit time, the "wait for final merges" time, and even whether to do periodic NRT refreshes during indexing.

If you execute a run, you should see at least the indexing time, index size, and number of segments reported. You can tune the RAM buffer, the number of indexing threads, etc.

Also, you should use the binary form of the LineFileDocs to minimize the CPU cost of decoding the incoming documents, so that as much CPU as possible goes to Lucene for the actual indexing.
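
To make the timing options above concrete, here is a minimal sketch of how an indexing throughput figure changes depending on whether commit time and "wait for final merges" time are counted. This is not luceneutil's code: the function name `docs_per_sec` and all of its parameters are hypothetical, chosen only for illustration.

```python
def docs_per_sec(doc_count, index_sec, commit_sec=0.0, merge_wait_sec=0.0,
                 include_commit=False, include_merges=False):
    """Return indexing throughput in documents per second.

    Hypothetical helper, not part of luceneutil. The two flags mirror
    the choice of whether commit time and final-merge wait time are
    counted as part of the total indexing time.
    """
    total_sec = index_sec
    if include_commit:
        total_sec += commit_sec
    if include_merges:
        total_sec += merge_wait_sec
    if total_sec <= 0:
        raise ValueError("total time must be positive")
    return doc_count / total_sec


if __name__ == "__main__":
    # Made-up numbers: 1000 docs, 8 s adding documents,
    # 2 s in commit, 10 s waiting for final merges.
    print(docs_per_sec(1000, 8.0))
    print(docs_per_sec(1000, 8.0, commit_sec=2.0, include_commit=True))
    print(docs_per_sec(1000, 8.0, commit_sec=2.0, merge_wait_sec=10.0,
                       include_commit=True, include_merges=True))
```

The same run can report quite different throughput depending on which phases are included, so it is worth keeping these settings fixed when comparing two indexing runs.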

tang-hi commented 2 years ago

Thanks, I got it.

mikemccand / luceneutil

Do we have the tools for benchmark on index build? #191