adjust / parquet_fdw

Parquet foreign data wrapper for PostgreSQL
PostgreSQL License

Why aren't we filtering rows by supplied filters in IterateForeignScan? #87

Open sushrut141 opened 2 weeks ago

sushrut141 commented 2 weeks ago

I came across this comment where the foreign scan is created:

We have no native ability to evaluate restriction clauses

Why is this the case? The ParquetReaders seem to read rows without regard to any filters in the query. Those filters could be used to avoid returning rows that do not match the WHERE clauses.

Why are all rows being returned as is? Will the planner re-run the filters on the returned rows?

sushrut141 commented 2 weeks ago

Hey, I would like to add this filtering support if you're open to contributions. Could you clarify whether there was a reason for not adding it initially? Thanks.

EDIT: Found the info in the docs. https://www.postgresql.org/docs/16/fdw-planning.html

 In simple cases the FDW can just strip RestrictInfo nodes from the scan_clauses list (using extract_actual_clauses) and put all the clauses into the plan node's qual list, which means that all the clauses will be checked by the executor at run time. More complex FDWs may be able to check some of the clauses internally, in which case those clauses can be removed from the plan node's qual list so that the executor doesn't waste time rechecking them.
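
To make the quoted passage concrete, here is a toy model in Python. It is only a sketch of the idea and does not use PostgreSQL or parquet_fdw APIs: `ToyPlan`, `plan_scan`, `iterate_scan`, and `execute` are hypothetical names invented for illustration. It shows that a wrapper which pushes nothing down still produces correct results, because every clause stays in the qual list and is rechecked after the scan, while a wrapper that checks a clause itself can drop it from that list and return fewer rows from the scan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]
Clause = Callable[[Row], bool]


@dataclass
class ToyPlan:
    """Hypothetical stand-in for a scan plan node (not a PostgreSQL struct)."""
    pushed_down: List[Clause]  # clauses the wrapper checks itself while scanning
    quals: List[Clause]        # clauses left for the executor to recheck


def plan_scan(scan_clauses: List[Clause],
              can_push_down: Callable[[Clause], bool]) -> ToyPlan:
    """Split the WHERE clauses into those the wrapper handles and the rest."""
    pushed = [c for c in scan_clauses if can_push_down(c)]
    remaining = [c for c in scan_clauses if not can_push_down(c)]
    return ToyPlan(pushed_down=pushed, quals=remaining)


def iterate_scan(rows: Iterable[Row], plan: ToyPlan) -> Iterator[Row]:
    """The wrapper's scan: applies only the clauses it agreed to check."""
    for row in rows:
        if all(clause(row) for clause in plan.pushed_down):
            yield row


def execute(rows: Iterable[Row], plan: ToyPlan) -> List[Row]:
    """The executor: rechecks whatever is still in the qual list."""
    return [row for row in iterate_scan(rows, plan)
            if all(qual(row) for qual in plan.quals)]


rows = [{"id": 1, "v": 10}, {"id": 2, "v": 25}, {"id": 3, "v": 40}]
where = [lambda r: r["v"] > 15, lambda r: r["id"] != 3]

# Simple case: nothing is pushed down, so the scan returns every row.
simple = plan_scan(where, lambda clause: False)
# "More complex" case: the wrapper checks the first clause internally.
pushdown = plan_scan(where, lambda clause: clause is where[0])

print(len(list(iterate_scan(rows, simple))))    # rows leaving the scan
print(len(list(iterate_scan(rows, pushdown))))  # fewer rows leave the scan
print(execute(rows, simple) == execute(rows, pushdown))
```

Both plans give the same final result. The difference is how many rows leave the scan, which is where filtering inside the reader would save work.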