dhatim / fastexcel

Generate and read big Excel files quickly

update poi read streaming benchmark #181

Closed pjfanning closed 2 years ago

pjfanning commented 2 years ago

Thanks for maintaining this great project. I'm a POI PMC member and maintainer of a fork of monitorjbl's excel-streaming-reader.

I updated your benchmark for POI reading with streaming. The test was set to read in styles, which the test as written does not need. Styles are only needed if you want to use POI's DataFormatter to format the data. Reading that unnecessary data has a material effect on the result.

I commented out the monitorjbl test because it crashes with the POI version that is loaded. It can be re-enabled if the POI version is forced to version 4 (e.g. 4.1.2).

I plan on testing my fork of monitorjbl. I'm adding some extra tuning features to it (e.g. optionally not loading styles). It will probably not match fastexcel for speed, but it has a few extra features; so far, I've prioritised features over performance.

The good news is that fastexcel still has the best read result, but POI streaming catches up a bit once styles are skipped.

Benchmark                                        Mode  Cnt  Score   Error  Units
ReaderBenchmark.apachePoi                          ss   15  3.259 ± 0.784   s/op
ReaderBenchmark.fastExcelReader                    ss   15  0.462 ± 0.069   s/op
ReaderBenchmark.streamingApachePoiWithStyles       ss   15  2.387 ± 0.088   s/op
ReaderBenchmark.streamingApachePoiWithoutStyles    ss   15  0.887 ± 0.045   s/op
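
To put the scores above in relative terms, here is a small sketch that divides one s/op score by another. The class name `BenchmarkRatios` and the `slowdown` helper are made up for illustration and are not part of the benchmark code; the numbers are copied from the table and the ± error columns are ignored, so treat the ratios as rough.

```java
public class BenchmarkRatios {

    /** How many times slower `slower` is than `faster`; both are s/op scores. */
    static double slowdown(double slower, double faster) {
        return slower / faster;
    }

    public static void main(String[] args) {
        // Scores (s/op) copied from the JMH output above; error margins ignored.
        double apachePoi = 3.259;
        double fastExcelReader = 0.462;
        double streamingWithStyles = 2.387;
        double streamingWithoutStyles = 0.887;

        System.out.printf("POI streaming, with styles vs without: %.2fx%n",
                slowdown(streamingWithStyles, streamingWithoutStyles));
        System.out.printf("POI streaming without styles vs fastexcel: %.2fx%n",
                slowdown(streamingWithoutStyles, fastExcelReader));
        System.out.printf("POI (non-streaming) vs fastexcel: %.2fx%n",
                slowdown(apachePoi, fastExcelReader));
    }
}
```

On these numbers, skipping styles makes the POI streaming read roughly 2.7 times faster, which leaves it a little under 2 times slower than fastexcel.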