flow-php / flow

Flow PHP - data processing framework
https://flow-php.com
MIT License

Improved performance of Scalar Function Parameter #1223

Closed norberttech closed 2 months ago

norberttech commented 2 months ago

Change Log

Added

Fixed

Changed

  • Improved performance of Scalar Function Parameter

Removed

Deprecated

Security


Description

This is another round of performance improvements for processing XML files. Parameter::asAnyOf was replaced with a more generic Parameter::as, which uses Flow Type. The main difference is in how the two functions are executed: Parameter::as evaluates the scalar function only once, which on big datasets may save millions of repeated evaluations of the same scalar function.
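The effect of evaluating only once can be illustrated with a minimal, self-contained sketch. It is not Flow's actual code: `CountingFunction`, `resolvePerProbe` and `resolveOnce` are hypothetical names made up for this example, and it assumes that the older approach re-evaluated the function for each candidate type it checked.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a scalar function (NOT Flow's real interface).
// It counts how many times it has been evaluated.
final class CountingFunction
{
    public int $calls = 0;

    public function __construct(private readonly mixed $value)
    {
    }

    public function eval(): mixed
    {
        $this->calls++;

        return $this->value;
    }
}

// Assumed "old" pattern: each type probe evaluates the function again.
function resolvePerProbe(CountingFunction $fn): mixed
{
    if (\is_string($fn->eval())) {
        return $fn->eval();
    }

    if (\is_int($fn->eval())) {
        return $fn->eval();
    }

    return null;
}

// "New" pattern: evaluate once, then run every type check on the cached value.
function resolveOnce(CountingFunction $fn): mixed
{
    $value = $fn->eval();

    return \is_string($value) || \is_int($value) ? $value : null;
}

$a = new CountingFunction(42);
$resultA = resolvePerProbe($a); // string probe, int probe, final read: 3 evaluations

$b = new CountingFunction(42);
$resultB = resolveOnce($b);     // 1 evaluation

echo $a->calls, ' vs ', $b->calls, PHP_EOL;
```

Both calls return the same value; only the number of evaluations differs. When this runs for every row, the extra evaluations per row add up across the whole dataset.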

github-actions[bot] commented 2 months ago

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors

```shell
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.540mb +0.01%  | 507.059ms -0.15% | ±2.36% +108.50% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.655mb +0.01%  | 1.065s -1.10%    | ±0.39% -17.52%  |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 29.111mb +0.00% | 426.267ms -0.83% | ±0.67% -74.80%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.297mb +0.01%  | 33.616ms +1.74%  | ±0.48% -71.53%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.278mb +0.01%  | 661.492ms -1.13% | ±1.66% +15.89%  |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
```

Transformers

```shell
+-----------------------------+--------------------------+------+-----+------------------+-----------------+---------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev        |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+---------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.573mb +0.00% | 59.729ms -1.56% | ±1.07% +6.00% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+---------------+
```

Loaders

```shell
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.738mb +0.00%  | 137.925ms -1.02% | ±0.37% +656.20% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 90.347mb +0.00%  | 117.479ms -0.91% | ±0.82% -29.00%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 124.466mb +0.00% | 1.221s -2.80%    | ±0.40% -23.22%  |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.488mb +0.00%  | 43.503ms -3.91%  | ±0.26% -81.17%  |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
```

Building Blocks

```shell
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 107.416mb +0.00% | 466.006ms -0.81% | ±0.17% -83.66%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 55.774mb +0.00%  | 240.472ms +0.33% | ±0.79% -32.79%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.612mb +0.00%  | 50.296ms +0.17%  | ±1.27% -49.68%  |
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.329mb +0.00%  | 3.409ms -0.14%   | ±3.14% +107.87% |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.933mb +0.00% | 188.791ms -0.14% | ±0.32% -73.63%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.653mb +0.00%  | 18.674ms -2.05%  | ±1.26% +277.14% |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.569mb +0.00%  | 1.734ms -3.32%   | ±0.45% -78.83%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.569mb +0.00%  | 1.736ms -1.90%   | ±1.32% +59.74%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.681mb +0.00%  | 2.807ms -1.88%   | ±0.70% +2.23%   |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 86.210mb +0.00%  | 16.079ms -3.83%  | ±1.22% +32.57%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 86.210mb +0.00%  | 16.241ms -4.61%  | ±0.61% -50.17%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 84.114mb +0.00%  | 1.606μs -10.77%  | ±2.89% +0.00%   |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 84.114mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.464mb +0.00%  | 12.234ms -0.79%  | ±1.33% -40.31%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.835mb +0.00% | 61.811ms -1.36%  | ±0.67% +62.01%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.730mb +0.00%  | 1.442ms -21.87%  | ±0.91% -27.30%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 90.086mb +0.00%  | 58.938ms -3.26%  | ±1.22% -9.22%   |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.832mb +0.00%  | 4.215ms -3.58%   | ±0.27% -92.32%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 84.264mb +0.00%  | 41.063ms +2.08%  | ±0.88% +4.75%   |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 84.265mb +0.00%  | 39.710ms -1.22%  | ±0.53% -41.73%  |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 84.264mb +0.00%  | 40.114ms +1.91%  | ±1.00% +12.18%  |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.556mb +0.00%  | 7.373ms -0.61%   | ±0.81% -31.35%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 84.114mb +0.00%  | 28.419ms -0.03%  | ±0.33% -11.89%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 84.114mb +0.00%  | 13.576μs -6.38%  | ±1.40% +148.66% |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 84.114mb +0.00%  | 15.679μs -8.51%  | ±0.79% -63.35%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.934mb +0.00% | 188.112ms -0.94% | ±1.03% +92.05%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 53.219mb +0.00%  | 391.315ms +0.11% | ±3.34% +615.86% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 13.485mb +0.00%  | 78.665ms -1.06%  | ±3.50% -1.28%   |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
```