flow-php / flow

Flow PHP - data processing framework
https://flow-php.com
MIT License

Added support for parquet multifile pagination #1050

Closed norberttech closed 5 months ago

norberttech commented 5 months ago

Change Log

Added

  • Added support for parquet multifile pagination

Fixed

Changed

Removed

Deprecated

Security


Description

github-actions[bot] commented 5 months ago

Flow PHP - Benchmarks

Benchmark results from this PR are compared with the results from the 1.x branch.

Extractors

```shell
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev         |
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.299mb +0.00%  | 860.683ms +0.27% | ±0.42% +21.18% |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 5.019mb +0.01%   | 345.554ms +0.24% | ±0.08% -62.96% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 5.174mb +0.01%   | 1.073s -1.16%    | ±0.50% -47.33% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 135.842mb +0.00% | 924.072ms -0.27% | ±0.87% -69.60% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.928mb +0.01%   | 35.626ms -1.29%  | ±0.24% -71.64% |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.934mb +0.01%   | 430.158ms -1.57% | ±0.65% -74.23% |
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
```

Transformers

```shell
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.240mb +0.00% | 61.240ms -1.85% | ±0.47% -68.35% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
```

Loaders

```shell
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev         |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 96.685mb +0.00%  | 473.590ms +1.08% | ±1.36% +22.18% |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 55.164mb +0.00%  | 68.593ms +0.18%  | ±0.46% +47.75% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 107.594mb +0.00% | 52.591ms +2.13%  | ±0.62% -34.96% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 227.010mb +0.00% | 1.444s +1.06%    | ±1.04% +20.08% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.976mb +0.00%  | 38.544ms -0.85%  | ±0.43% -50.29% |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
```

Building Blocks

```shell
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.060mb +0.00%  | 3.455ms -7.85%   | ±2.15% -4.87%   |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.658mb +0.00% | 189.113ms -1.98% | ±0.99% +165.54% |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.378mb +0.00%  | 18.897ms -1.85%  | ±0.48% +35.08%  |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.300mb +0.00%  | 1.717ms -8.47%   | ±1.19% -63.61%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.300mb +0.00%  | 1.918ms +4.79%   | ±3.25% +186.91% |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.412mb +0.00%  | 2.855ms -1.37%   | ±2.31% -15.50%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 85.941mb +0.00%  | 18.115ms -1.57%  | ±0.57% -73.67%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 85.941mb +0.00%  | 17.907ms -1.86%  | ±1.37% +206.62% |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 83.846mb +0.00%  | 1.894μs -5.30%   | ±2.53% +0.00%   |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 83.846mb +0.00%  | 0.500μs +25.00%  | ±0.00% -100.00% |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.195mb +0.00%  | 13.409ms -0.99%  | ±1.91% -10.70%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.566mb +0.00% | 63.589ms -0.98%  | ±0.38% -41.76%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.461mb +0.00%  | 1.641ms +6.65%   | ±1.08% -67.44%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 89.808mb +0.00%  | 67.614ms -0.95%  | ±1.11% -20.22%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.562mb +0.00%  | 4.170ms +1.42%   | ±1.02% -5.07%   |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 83.989mb +0.00%  | 40.400ms -4.05%  | ±1.73% +22.37%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 83.990mb +0.00%  | 40.851ms +0.98%  | ±1.99% -26.55%  |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 83.989mb +0.00%  | 41.816ms +3.38%  | ±3.00% +183.34% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.287mb +0.00%  | 7.602ms +0.91%   | ±1.36% +555.94% |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 83.846mb +0.00%  | 29.034ms -0.59%  | ±0.42% -83.44%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 83.846mb +0.00%  | 14.312μs +2.38%  | ±0.66% -26.16%  |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 83.846mb +0.00%  | 16.138μs -4.41%  | ±2.30% +172.69% |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.660mb +0.00% | 193.554ms -1.33% | ±0.49% +63.89%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.738mb +0.00% | 540.177ms +3.07% | ±2.12% +959.80% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 60.216mb +0.00%  | 257.623ms -0.65% | ±0.97% -14.02%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 15.150mb +0.00%  | 57.444ms +1.80%  | ±1.76% +143.78% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.970mb +0.00%  | 447.107ms +2.96% | ±0.66% -28.65%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.509mb +0.00%  | 88.622ms +0.53%  | ±0.59% -34.80%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
```