paradedb / pg_analytics

DuckDB-powered analytics for Postgres
https://paradedb.com
PostgreSQL License
383 stars 15 forks

test: Support for Multi-Level Partition Tables #115

Closed shamb0 closed 1 month ago

shamb0 commented 2 months ago

Closes #56

What

Implements a demonstration test for multi-level partition tables, addressing issue #56.

Why

This demonstration showcases the pg_analytics extension's ability to support multi-level partitioned tables. Hive-style partitioning organizes the data hierarchically, so queries can read only the relevant subset of files, improving query performance and scalability.

How

The implementation involves two key components:

  1. Hive-style Partitioning in S3

    • Organizes data in S3 bucket using a hierarchical structure:
      s3://{bucket}/year={year}/manufacturer={manufacturer}/data_{index}.parquet
    • Implementation in code:
      for (i, batch) in partitioned_batches.iter().enumerate() {
          let key = format!(
              "year={}/manufacturer={}/data_{}.parquet",
              year, manufacturer, i
          );
          s3.put_batch(s3_bucket, &key, batch).await?;
      }
  2. FOREIGN TABLE Configuration in pg_analytics

    • Configures a FOREIGN TABLE with options to access the Hive-style partitioned dataset:
      CREATE FOREIGN TABLE auto_sales (
          sale_id                 BIGINT,
          sale_date               DATE,
          manufacturer            TEXT,
          model                   TEXT,
          price                   DOUBLE PRECISION,
          dealership_id           INT,
          customer_id             INT,
          year                    INT,
          month                   INT
      )
      SERVER auto_sales_server
      OPTIONS (
          files 's3://{s3_bucket}/year=*/manufacturer=*/data_*.parquet',
          hive_partitioning '1'
      );

Tests

To run the demonstration test, use the following command:

RUST_LOG=info \
    cargo test \
    --test test_mlp_auto_sales \
    -- \
    --nocapture
shamb0 commented 2 months ago

Hi @rebasedming,

I'd like to provide an update on the progress of the investigation:

  • I have refactored the execution hook to handle multi-level partition tables, enabling query pushdown to DuckDB.
  • I completed a sanity test using the existing auto sales dataset, and everything is working as expected.

Could you please provide feedback on the approach used for handling multi-level partition table queries in the execution hook? Additionally, I would appreciate any suggestions on how it can be further refined.

rebasedming commented 2 months ago

> Hi @rebasedming,
>
> I'd like to provide an update on the progress of the investigation:
>
>   • I have refactored the execution hook to handle multi-level partition tables, enabling query pushdown to DuckDB.
>   • I completed a sanity test using the existing auto sales dataset, and everything is working as expected.
>
> Could you please provide feedback on the approach used for handling multi-level partition table queries in the execution hook? Additionally, I would appreciate any suggestions on how it can be further refined.

Hi @shamb0 -

I admire your perseverance chasing down this issue! It's a tricky one.

I know this is just a draft, but I don't feel good about this implementation. Intercepting and rewriting the query feels pretty unsafe and introduces a lot of complexity to the code base. I'm open to being convinced but I think this is way too much technical overhead for just this one feature.

shamb0 commented 2 months ago

Hi @rebasedming,

Thank you for your thorough review and candid feedback on the PR code patch. I truly appreciate the time you've taken and your openness to further discussion.

Regarding your concerns:

  1. Safety and Complexity:

    • I understand your reservations about query interception and SQL statement remapping (from root partition table to foreign table names).
    • This approach, while complex, aligns with the core design model of pg_analytics. It's not a new pattern in our system.
  2. Current State and Future Plans:

    • The current patch is in an initial prototype state. I acknowledge there's significant work ahead to refine and improve it.
    • I plan to explore additional possibilities to address the safety and complexity concerns you've raised.
  3. Moving Forward:

    • I'll continue working on this and will update you with any new approaches or improvements.
    • Your input is valuable. If you have any specific suggestions or ideas, I'd be grateful to hear them. They would be extremely helpful in building a more robust solution.

I'm looking forward to our continued collaboration on this.

shamb0 commented 2 months ago

Hi @rebasedming,

  1. Firstly, I'd like to acknowledge that my earlier patches may have been overly complex. Thank you for your patience as I worked through this.

  2. I have some good news to share! Today, I discovered the hive_partitioning option in ParquetOption, which I had previously overlooked. After some experimentation, I found that this leads to a much simpler solution than I initially proposed.

  3. Key point: This simpler approach doesn't require any complex intercepting or SQL statement remapping at the executor hook.

  4. Next steps: I'll be cleaning up the code based on this new approach. I expect to have the updated patch ready by tomorrow morning.

  5. For your reference, I've attached a snapshot of the PostgreSQL server trace.

Thank you for your guidance throughout this process. I appreciate your support.
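The reason the `hive_partitioning` option removes the need for query interception is that DuckDB prunes at the file level: it reads partition values straight out of the object paths and skips objects whose encoded values cannot satisfy the predicate. The sketch below is illustrative only (the `partition_year` and `prune_by_year` helpers are hypothetical, not DuckDB or pg_analytics code); it shows the idea of filtering objects by the `year=` value in their keys, as DuckDB does internally for a predicate like `year BETWEEN 2020 AND 2024`.

```rust
// Extract the year encoded in a Hive-style object key such as
// "year=2021/manufacturer=Toyota/data_0.parquet".
fn partition_year(key: &str) -> Option<i32> {
    key.split('/')
        .find_map(|seg| seg.strip_prefix("year="))
        .and_then(|v| v.parse().ok())
}

// Keep only the objects whose year partition falls inside [lo, hi];
// everything else is skipped without ever being downloaded or parsed.
fn prune_by_year<'a>(keys: &[&'a str], lo: i32, hi: i32) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|k| partition_year(k).map_or(false, |y| (lo..=hi).contains(&y)))
        .collect()
}

fn main() {
    let keys = [
        "year=2019/manufacturer=Ford/data_0.parquet",
        "year=2021/manufacturer=Toyota/data_0.parquet",
        "year=2024/manufacturer=Ford/data_0.parquet",
    ];
    // Only the 2021 and 2024 objects need to be scanned.
    println!("{:?}", prune_by_year(&keys, 2020, 2024));
}
```

Since the pruning happens inside DuckDB's scan of the glob pattern, the executor hook can forward the original SQL unchanged, which is what makes this approach so much simpler than rewriting table names in intercepted statements.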

2024-08-30 22:01:28.251 IST [590249] STATEMENT:  drop database if exists "_sqlx_test_640";
2024-08-30 22:01:33.544 IST [590248] LOG:  statement: 
                DROP TABLE IF EXISTS auto_sales CASCADE;

2024-08-30 22:01:33.544 IST [590248] NOTICE:  table "auto_sales" does not exist, skipping
2024-08-30 22:01:33.544 IST [590248] LOG:  statement: 
                DROP SERVER IF EXISTS auto_sales_server CASCADE;

2024-08-30 22:01:33.544 IST [590248] NOTICE:  server "auto_sales_server" does not exist, skipping
2024-08-30 22:01:33.544 IST [590248] LOG:  statement: 
                DROP FOREIGN DATA WRAPPER IF EXISTS parquet_wrapper CASCADE;

2024-08-30 22:01:33.544 IST [590248] NOTICE:  foreign-data wrapper "parquet_wrapper" does not exist, skipping
2024-08-30 22:01:33.545 IST [590248] LOG:  statement: 
                DROP USER MAPPING IF EXISTS FOR public SERVER auto_sales_server;

2024-08-30 22:01:33.545 IST [590248] NOTICE:  server "auto_sales_server" does not exist, skipping
2024-08-30 22:01:33.545 IST [590248] LOG:  statement: CREATE FOREIGN DATA WRAPPER parquet_wrapper
                    HANDLER parquet_fdw_handler
                    VALIDATOR parquet_fdw_validator
2024-08-30 22:01:33.545 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() X ***
2024-08-30 22:01:33.545 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() Y ***
2024-08-30 22:01:33.551 IST [590248] LOG:  statement: CREATE SERVER auto_sales_server
                    FOREIGN DATA WRAPPER parquet_wrapper
2024-08-30 22:01:33.551 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() X ***
2024-08-30 22:01:33.551 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() Y ***
2024-08-30 22:01:33.553 IST [590248] LOG:  statement: CREATE USER MAPPING FOR public
                    SERVER auto_sales_server
                    OPTIONS (
                        type 'S3',
                        region 'us-east-1',
                        endpoint 'localhost:33182',
                        use_ssl 'false',
                        url_style 'path'
                    )
2024-08-30 22:01:33.553 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() X ***
2024-08-30 22:01:33.553 IST [590248] WARNING:  pga:: *** ParquetFdw::validator() Y ***
2024-08-30 22:01:33.556 IST [590248] LOG:  statement: 
                CREATE FOREIGN TABLE auto_sales (
                    sale_id                 BIGINT,
                    sale_date               DATE,
                    manufacturer            TEXT,
                    model                   TEXT,
                    price                   DOUBLE PRECISION,
                    dealership_id           INT,
                    customer_id             INT,
                    year                    INT,
                    month                   INT
                )
                SERVER auto_sales_server
                OPTIONS (
                    files 's3://demo-mlp-auto-sales/year=*/manufacturer=*/data_*.parquet',
                    hive_partitioning '1'
                );

...

2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** fdw::register_duckdb_view() Y ***
2024-08-30 22:01:33.606 IST [590248] LOG:  execute sqlx_s_2: 
                SELECT year, manufacturer, ROUND(SUM(price)::numeric, 4)::float8 as total_sales
                FROM auto_sales
                WHERE year BETWEEN 2020 AND 2024
                GROUP BY year, manufacturer
                ORDER BY year, total_sales DESC;

2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** executor_run() X ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** ExtensionHook::executor_run() X ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** get_current_query() X ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** get_current_query() "\n            SELECT year, manufacturer, ROUND(SUM(price)::numeric, 4)::float8 as total_sales\n            FROM auto_sales\n            WHERE year BETWEEN 2020 AND 2024\n            GROUP BY year, manufacturer\n            ORDER BY year, total_sales DESC" ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** get_query_relations() X ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** get_query_relations() 1 ***
2024-08-30 22:01:33.606 IST [590248] WARNING:  query_relations.is_empty() :: false
2024-08-30 22:01:33.606 IST [590248] WARNING:  pga:: *** duckdb::create_arrow() "\n            SELECT year, manufacturer, ROUND(SUM(price)::numeric, 4)::float8 as total_sales\n            FROM auto_sales\n            WHERE year BETWEEN 2020 AND 2024\n            GROUP BY year, manufacturer\n            ORDER BY year, total_sales DESC" ***
2024-08-30 22:01:34.091 IST [590248] WARNING:  pga:: *** duckdb::create_arrow() Y ***
2024-08-30 22:01:34.091 IST [590248] WARNING:  pga:: *** ExtensionHook::write_batches_to_slots() X 4 ***
2024-08-30 22:01:34.091 IST [590248] WARNING:  pga:: *** ExtensionHook::write_batches_to_slots() Y ***
2024-08-30 22:01:34.091 IST [590248] WARNING:  pga:: *** ExtensionHook::executor_run() Y ***
2024-08-30 22:01:34.091 IST [590248] WARNING:  pga:: *** executor_run() Y ***
2024-08-30 22:06:23.245 IST [61966] LOG:  checkpoint starting: time
2024-08-30 22:07:58.164 IST [61966] LOG:  checkpoint complete: wrote 948 buffers (5.8%); 0 WAL file(s) added, 0 removed, 0 recycled; write=94.872 s, sync=0.024 s, total=94.920 s; sync files=304, longest=0.006 s, average=0.001 s; distance=4300 kB, estimate=4306 kB; lsn=0/955A6278, redo lsn=0/955A6240
shamb0 commented 2 months ago

Hi @philippemnoel,

I have resolved the code conflicts, and the branch should now be ready for intake review and potential merge. :)

philippemnoel commented 2 months ago

> Hi @philippemnoel,
>
> I have resolved the code conflicts, and the branch should now be ready for intake review and potential merge. :)

Thank you! I'm personally very excited about this iteration; it feels much better. We're pretty busy with a few big features right now, but we'll review ASAP, and I'm hopeful we can get this merged next week. @rebasedming is the final authority on any merge, so he'll take a look whenever he has some time!

shamb0 commented 2 months ago

@Weijun-H, thanks for your feedback! I’ve addressed all your comments. Let me know if there’s anything else to improve.

shamb0 commented 1 month ago

Hi @philippemnoel,

Thank you for the review comments, I really appreciate it. Moving forward, I'll aim for smaller PRs with minimal changes to make them easier to review.