flow-php / flow

Flow PHP - data processing framework
https://flow-php.com
MIT License

Added StdOut Filesystem with stdout protocol #1233

Closed norberttech closed 2 months ago

norberttech commented 2 months ago

Change Log

Added

  • Added StdOut Filesystem with stdout protocol

Fixed

Changed

  • Loaders no longer need to rely on the file extension to close streams in a closure

Removed

Deprecated

Security


Description

Example:

<?php

use function Flow\ETL\Adapter\JSON\to_json;
use function Flow\ETL\DSL\df;
use function Flow\ETL\DSL\from_array;
use function Flow\Filesystem\DSL\path_stdout;

require __DIR__ . '/../../../vendor/autoload.php';

df()
    ->read(from_array([
        ['id' => 1, 'name' => 'Product 1', 'aw_product_id' => 100],
        ['id' => 2, 'name' => 'Product 2', 'aw_product_id' => 200],
        ['id' => 3, 'name' => 'Product 3', 'aw_product_id' => 300],
    ]))
    ->batchSize(1)
    ->write(to_json(path_stdout())->withFlags(JSON_PRETTY_PRINT))
    ->run();

Terminal output:

[{
    "id": 1,
    "name": "Product 1",
    "aw_product_id": 100
},{
    "id": 2,
    "name": "Product 2",
    "aw_product_id": 200
},{
    "id": 3,
    "name": "Product 3",
    "aw_product_id": 300
}]
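
The `},{` separators in the output above come from each batch being appended to the same open stream. Below is a minimal sketch of that pattern in plain PHP. It is not Flow's implementation: `write_json_batches()` is a hypothetical helper, and the only assumption is that the target is an already opened stream such as `php://stdout`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Flow): writes batches of rows to an open
 * stream as a single JSON array, one encoded row at a time.
 *
 * @param resource $stream any writable stream, e.g. fopen('php://stdout', 'w')
 * @param iterable<array<array<string, mixed>>> $batches
 */
function write_json_batches($stream, iterable $batches, int $flags = 0) : void
{
    // Open the array once, before the first batch.
    fwrite($stream, '[');
    $first = true;

    foreach ($batches as $rows) {
        foreach ($rows as $row) {
            // Every row except the first is preceded by a comma.
            if (!$first) {
                fwrite($stream, ',');
            }

            fwrite($stream, json_encode($row, $flags | JSON_THROW_ON_ERROR));
            $first = false;
        }
    }

    // Close the array once, after the last batch.
    fwrite($stream, ']');
}

$stdout = fopen('php://stdout', 'w');

write_json_batches(
    $stdout,
    [
        [['id' => 1, 'name' => 'Product 1']],
        [['id' => 2, 'name' => 'Product 2']],
    ],
    JSON_PRETTY_PRINT
);

fwrite($stdout, PHP_EOL);
fclose($stdout);
```

Because the helper only sees a stream resource, the closing `]` is written regardless of what the destination path looks like, which is the kind of behavior the "Changed" entry describes.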
github-actions[bot] commented 2 months ago

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from the 1.x branch.

Extractors

```shell
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.546mb +0.28%  | 511.670ms -0.03% | ±0.43% -5.85%   |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.660mb +0.27%  | 1.080s -1.64%    | ±0.96% +5.14%   |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 29.117mb +0.04% | 429.776ms -0.54% | ±1.16% +102.92% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.302mb +0.30%  | 33.105ms +0.16%  | ±0.24% -77.48%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.284mb +0.30%  | 661.600ms -3.90% | ±0.97% -47.88%  |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
```

Transformers

```shell
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev          |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.578mb +0.01% | 60.576ms -0.66% | ±2.64% +355.28% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
```

Loaders

```shell
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.744mb +0.02%  | 123.073ms +0.00% | ±0.63% -45.87%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 90.353mb +0.01%  | 102.368ms +0.46% | ±1.73% +72.96%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 124.471mb +0.01% | 1.243s -0.79%    | ±0.47% +136.07% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.493mb +0.07%  | 29.761ms +0.47%  | ±1.19% -3.41%   |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
```

Building Blocks

```shell
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 53.225mb +0.01%  | 393.398ms -0.47% | ±0.86% +53.69%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 13.490mb +0.04%  | 79.926ms -0.36%  | ±0.23% -68.70%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 107.421mb +0.00% | 471.108ms +0.90% | ±0.43% -39.38%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 55.780mb +0.01%  | 240.118ms +1.58% | ±0.87% +46.80%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.618mb +0.03%  | 51.848ms -1.82%  | ±0.63% -34.94%  |
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.335mb +0.08%  | 3.661ms -3.38%   | ±2.13% -40.50%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.939mb +0.00% | 189.970ms -0.25% | ±1.13% +19.72%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.659mb +0.01%  | 18.943ms -1.22%  | ±1.00% +100.96% |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.575mb +0.08%  | 2.295ms +15.05%  | ±1.84% +18.87%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.575mb +0.08%  | 2.256ms +14.63%  | ±2.26% -16.74%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.687mb +0.08%  | 3.213ms +3.83%   | ±1.06% -16.66%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 86.281mb +0.01%  | 16.875ms -3.23%  | ±0.46% -81.49%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 86.281mb +0.01%  | 17.292ms -4.14%  | ±1.20% -50.54%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 84.120mb +0.08%  | 1.800μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 84.120mb +0.08%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.535mb +0.01%  | 13.352ms +3.06%  | ±2.76% +59.21%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.906mb +0.00% | 63.023ms +1.26%  | ±1.22% +325.46% |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.801mb +0.01%  | 1.736ms -4.83%   | ±1.32% -27.64%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 90.092mb +0.01%  | 60.361ms +1.29%  | ±1.12% +190.60% |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.837mb +0.08%  | 4.770ms +2.53%   | ±0.67% -73.98%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 84.269mb +0.01%  | 41.929ms +0.81%  | ±3.15% +468.41% |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 84.270mb +0.01%  | 39.641ms -1.60%  | ±1.59% -6.99%   |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 84.269mb +0.01%  | 40.708ms +0.65%  | ±0.97% -40.79%  |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.561mb +0.08%  | 7.431ms -0.55%   | ±2.62% +798.16% |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 84.184mb +0.01%  | 28.886ms -0.72%  | ±1.41% +82.89%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 84.120mb +0.08%  | 14.983μs +6.70%  | ±3.38% +45.04%  |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 84.120mb +0.08%  | 16.381μs -0.19%  | ±3.42% +497.75% |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.940mb +0.00% | 191.701ms -1.55% | ±0.30% -5.48%   |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
```