flow-php / flow

Flow PHP - data processing framework
https://flow-php.com
MIT License

Fixed reading required columns from parquet files #1049

Closed: norberttech closed this 2 months ago

norberttech commented 2 months ago

Change Log

Added

Fixed

Changed

  • Flysystem scan will sort files by file name

Removed

Deprecated

Security


Description

github-actions[bot] commented 2 months ago

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from the 1.x branch.

Extractors
```shell
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.298mb +0.00%  | 855.853ms -0.97% | ±0.59% -43.07%  |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 5.018mb +0.02%   | 344.496ms -0.06% | ±0.29% -8.26%   |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 5.173mb +0.02%   | 1.084s -0.42%    | ±3.49% +80.04%  |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 135.840mb +0.00% | 908.145ms -1.91% | ±0.21% -80.90%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.928mb +0.02%   | 36.342ms +1.36%  | ±0.15% -67.67%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.934mb +0.02%   | 451.669ms +4.16% | ±2.91% +504.23% |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
```
Transformers
```shell
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.239mb +0.00% | 61.114ms -1.02% | ±1.75% +51.56% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
```
Loaders
```shell
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 96.684mb +0.00%  | 466.352ms -0.86% | ±0.52% -85.25%  |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 55.163mb +0.00%  | 70.024ms +1.01%  | ±0.56% -23.84%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 107.593mb +0.00% | 51.071ms -2.05%  | ±0.55% -77.71%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 227.007mb +0.00% | 1.410s -1.42%    | ±1.48% +173.17% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.976mb +0.01%  | 38.670ms -2.33%  | ±0.96% -0.36%   |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
```
Building Blocks
```shell
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.060mb +0.00%  | 3.657ms +2.86%   | ±3.59% +70.58%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.658mb +0.00% | 189.530ms +0.71% | ±0.24% -68.43%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.378mb +0.00%  | 18.959ms +0.51%  | ±1.08% +0.46%   |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.300mb +0.00%  | 2.001ms +7.35%   | ±3.49% +111.23% |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.300mb +0.00%  | 1.858ms -6.21%   | ±1.54% +7.08%   |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.412mb +0.00%  | 2.698ms -2.79%   | ±2.04% +141.78% |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 85.941mb +0.00%  | 16.438ms -1.12%  | ±0.39% -68.69%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 85.941mb +0.00%  | 16.418ms -1.73%  | ±2.12% +58.60%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 83.845mb +0.00%  | 1.906μs +5.89%   | ±2.44% +0.00%   |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 83.845mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.195mb +0.00%  | 12.632ms -5.20%  | ±0.76% -72.49%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.566mb +0.00% | 61.908ms -0.82%  | ±1.34% +4.40%   |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.460mb +0.00%  | 1.291ms -16.07%  | ±1.82% -12.59%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 89.807mb +0.00%  | 64.946ms -2.25%  | ±0.94% -31.64%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.562mb +0.00%  | 4.412ms +7.87%   | ±0.88% -29.86%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 83.988mb +0.00%  | 40.500ms -1.41%  | ±0.53% -49.30%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 83.989mb +0.00%  | 41.026ms +0.32%  | ±1.25% +233.22% |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 83.988mb +0.00%  | 41.012ms -1.89%  | ±1.79% +306.76% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.286mb +0.00%  | 7.448ms -1.39%   | ±1.03% -49.43%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 83.845mb +0.00%  | 29.703ms +1.00%  | ±1.13% +98.02%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 83.845mb +0.00%  | 14.144μs +0.99%  | ±2.62% +679.17% |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 83.845mb +0.00%  | 16.380μs -1.40%  | ±1.03% +82.47%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.659mb +0.00% | 194.487ms -0.36% | ±0.78% -26.66%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.737mb +0.00% | 519.806ms -0.20% | ±0.27% +9.41%   |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 60.215mb +0.00%  | 264.013ms +0.53% | ±0.55% -24.38%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 15.150mb +0.01%  | 56.830ms +0.28%  | ±0.75% -64.07%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.969mb +0.00%  | 440.632ms +0.62% | ±0.99% +179.40% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.508mb +0.01%  | 87.658ms -0.46%  | ±0.16% -65.80%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
```