Closed tang-hi closed 2 years ago
You are correct that the benchmark is search-focused.
But it also reports index size and indexing throughput / total time, and gives you control over whether to include the commit time, the "wait for final merges" time, and even whether to do periodic NRT refreshes during indexing.
If you execute a run you should see at least the index time, size, and number of segments reported. You can tune the RAM buffer, number of indexing threads, etc.
Also, you should use the binary form of the LineFileDocs to minimize the CPU cost of decoding the incoming documents, so that as much CPU as possible goes to Lucene for doing the indexing.
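To illustrate what "whether to include the commit time" means for the reported throughput, here is a minimal sketch in Python. It is not the benchmark's own code: `measure_indexing`, `add_doc`, `commit`, and `include_commit` are hypothetical names made up for this example, and a plain list stands in for the real indexer.

```python
import time
from typing import Callable, Iterable


def measure_indexing(
    docs: Iterable[str],
    add_doc: Callable[[str], None],
    commit: Callable[[], None],
    include_commit: bool = True,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time an indexing run and report throughput.

    add_doc and commit are hypothetical stand-ins for a real indexer.
    include_commit decides whether commit time counts toward the total.
    """
    start = clock()
    count = 0
    for doc in docs:
        add_doc(doc)
        count += 1
    indexed_at = clock()

    commit()
    committed_at = clock()

    # Pick the end of the measured window based on the flag.
    end = committed_at if include_commit else indexed_at
    elapsed = end - start
    return {
        "docs": count,
        "index_sec": indexed_at - start,
        "commit_sec": committed_at - indexed_at,
        "total_sec": elapsed,
        "docs_per_sec": count / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    fake_index = []  # assumption: a list standing in for the index
    report = measure_indexing(
        (f"doc {i}" for i in range(100_000)),
        add_doc=fake_index.append,
        commit=lambda: None,
        include_commit=False,
    )
    print(report)
```

The same run can look noticeably faster or slower depending on the flag, so it is worth noting which setting was used when comparing indexing numbers.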
Thanks, I got it.
The guide in the README looks like it only covers a search benchmark. I didn't find the benchmark tools for index building. If we have such tools, could you tell me which file? 😄