flow-php / flow

Flow PHP - data processing framework
https://flow-php.com
MIT License

Added possibility to set cache batch size #1034

Closed norberttech closed 7 months ago

norberttech commented 7 months ago

Change Log

Added

  • cache batch size configuration

Fixed

Changed

  • Replaced CompressingSerializer with NativeSerializer

Removed

Deprecated

Security


Description

I recently noticed a drastic performance degradation in the sorting operation. Since sort only switches to file-based sorting after reaching a specific memory consumption, it wasn't easy to notice in the first place.

The problem is essentially a missed regression: after I changed how extractors/loaders work (by default one row at a time), I missed the fact that this also hits the caching pipeline, which in turn affects sorting.

To resolve this, a new config entry was added: the cache batch size, which defaults to 2000. This means that the caching pipeline will process 2000 rows at once, reducing the number of I/O operations.
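As a rough illustration of why batching reduces I/O, the sketch below counts cache writes for 10,000 rows with a batch size of 1 versus 2000. `CountingCache`, `cacheInBatches` and `generateRows` are hypothetical names made up for this example; they are not Flow's API, and the real caching pipeline may buffer rows differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for a cache backend; it only counts writes.
 * Not part of Flow's API.
 */
final class CountingCache
{
    public int $writes = 0;

    /** @var array<int, string> */
    private array $entries = [];

    public function add(string $payload): void
    {
        $this->entries[] = $payload;
        $this->writes++;
    }
}

/**
 * Buffers rows and writes them to the cache in batches of $batchSize.
 * Returns the number of writes performed on the given cache.
 *
 * @param iterable<array<string, mixed>> $rows
 */
function cacheInBatches(iterable $rows, CountingCache $cache, int $batchSize): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $buffer = [];

    foreach ($rows as $row) {
        $buffer[] = $row;

        if (count($buffer) === $batchSize) {
            $cache->add(serialize($buffer)); // one write per full batch
            $buffer = [];
        }
    }

    if ($buffer !== []) {
        $cache->add(serialize($buffer)); // flush the final, partial batch
    }

    return $cache->writes;
}

/** Yields $count simple rows, one at a time, like a row-by-row extractor would. */
function generateRows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'name' => 'row_' . $i];
    }
}

$rowByRow = cacheInBatches(generateRows(10_000), new CountingCache(), 1);
$batched = cacheInBatches(generateRows(10_000), new CountingCache(), 2000);

echo "batch size 1: {$rowByRow} writes\n";    // 10000 writes
echo "batch size 2000: {$batched} writes\n";  // 5 writes
```

With one row per write the cache is hit 10,000 times; with a batch size of 2000 the same rows need only 5 writes, which is the kind of reduction the new config entry is aiming for.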

On top of that, I also changed the default serializer from CompressingSerializer to NativeSerializer, which does not do any compression; that also gives us a noticeable performance boost.
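The trade-off between the two approaches can be sketched with PHP's built-in functions. This is not the code of either serializer class: zlib (`gzcompress`/`gzuncompress`) is only an assumption standing in for "some compression step", and the sample rows are invented for the example.

```php
<?php

declare(strict_types=1);

// Hypothetical sample data, not taken from the benchmarks in this PR.
$rows = [];
for ($i = 0; $i < 2000; $i++) {
    $rows[] = ['id' => $i, 'name' => 'row_' . $i, 'active' => $i % 2 === 0];
}

// Plain native serialization: no work beyond serialize()/unserialize().
$plain = serialize($rows);

// Serialization followed by compression (zlib is an assumption for this sketch).
// The payload is smaller, but every write and read pays for the extra CPU step.
$compressed = gzcompress(serialize($rows));

// Both variants restore the same rows.
$restoredPlain = unserialize($plain);
$restoredCompressed = unserialize(gzuncompress($compressed));

printf("plain: %d bytes, compressed: %d bytes\n", strlen($plain), strlen($compressed));
```

Skipping compression trades larger cache payloads for less CPU work per batch, which matters when the cache is written and read many times during a sort.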

Additionally, during the investigation I noticed that the PSRSimpleCache implementation is not the most optimal way of using PSR16Cache; I will create a dedicated issue for that.

github-actions[bot] commented 7 months ago

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors

```shell
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.287mb -0.01%  | 848.097ms -1.64% | ±0.65% -64.15%  |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 5.007mb -0.04%   | 345.182ms +0.21% | ±0.79% +274.66% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 5.161mb -0.04%   | 1.067s -0.37%    | ±0.28% -69.73%  |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 135.828mb -0.00% | 935.629ms +2.68% | ±0.94% +424.08% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.917mb -0.04%   | 35.501ms -2.62%  | ±0.57% -66.99%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.923mb -0.04%   | 433.433ms -0.27% | ±0.39% +20.29%  |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
```
Transformers

```shell
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.228mb -0.00% | 61.715ms +1.62% | ±0.61% -39.79% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
```
Loaders

```shell
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 96.672mb -0.00%  | 462.610ms +0.61% | ±0.37% +18.73%  |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 55.148mb -0.00%  | 70.891ms +2.13%  | ±0.44% -76.53%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 107.581mb -0.00% | 52.443ms +2.87%  | ±0.27% +16.95%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 226.996mb -0.00% | 1.441s +2.19%    | ±0.56% +899.26% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.964mb -0.01%  | 40.470ms -1.21%  | ±0.17% -85.51%  |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
```
Building Blocks

```shell
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.050mb +0.00%  | 3.750ms +10.25%  | ±1.00% -59.52%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.648mb +0.00% | 192.634ms +0.25% | ±0.96% +12.32%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.368mb +0.00%  | 19.266ms -0.87%  | ±1.64% +268.90% |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.290mb +0.00%  | 2.310ms +28.20%  | ±3.29% +127.38% |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.290mb +0.00%  | 1.999ms +13.79%  | ±1.74% -10.13%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.402mb +0.00%  | 2.782ms +3.97%   | ±3.31% +58.02%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 85.931mb +0.00%  | 17.452ms +6.93%  | ±1.32% -0.74%   |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 85.931mb +0.00%  | 17.139ms +3.80%  | ±0.39% -69.32%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 83.835mb +0.00%  | 2.100μs +10.17%  | ±0.00% -100.00% |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 83.835mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.185mb +0.00%  | 13.097ms +4.13%  | ±1.21% +559.95% |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.556mb +0.00% | 64.426ms +3.32%  | ±0.78% -49.91%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.451mb +0.00%  | 1.677ms +15.94%  | ±2.59% +30.13%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 89.797mb +0.00%  | 68.161ms +6.72%  | ±0.76% +13.39%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.552mb +0.00%  | 4.128ms +8.32%   | ±0.29% -87.56%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 83.913mb +0.00%  | 41.564ms +2.86%  | ±2.02% +23.37%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 83.914mb +0.00%  | 41.802ms +2.80%  | ±1.68% -0.12%   |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 83.913mb +0.00%  | 42.898ms +7.59%  | ±1.06% +338.05% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.276mb +0.00%  | 7.594ms +2.00%   | ±3.21% +123.95% |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 83.835mb +0.00%  | 30.367ms +4.03%  | ±2.03% +165.67% |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 83.835mb +0.00%  | 14.342μs +7.93%  | ±2.07% +6.11%   |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 83.835mb +0.00%  | 17.444μs +8.35%  | ±2.12% +39.35%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.649mb +0.00% | 195.050ms +0.31% | ±0.63% -38.60%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.727mb +0.00% | 529.626ms +3.89% | ±1.55% +48.03%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 60.205mb +0.00%  | 259.248ms +0.77% | ±0.31% -20.15%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 15.140mb +0.00%  | 56.686ms +4.58%  | ±0.14% -83.26%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.960mb +0.00%  | 440.040ms +1.51% | ±0.08% -57.07%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.499mb +0.00%  | 87.494ms -0.22%  | ±0.26% -84.31%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
```