Closed sdressler closed 3 years ago
Also happens with an ORDER BY:
foo=# explain analyze select id, value from test order by id;
                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=5.10..5.24 rows=12 width=13) (actual time=27.599..31.396 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=5.08..5.09 rows=6 width=13) (actual time=0.075..0.076 rows=0 loops=3)
         Sort Key: id
         Sort Method: quicksort  Memory: 25kB
         Worker 0:  Sort Method: quicksort  Memory: 25kB
         Worker 1:  Sort Method: quicksort  Memory: 25kB
         ->  Parallel Foreign Scan on test  (cost=0.00..5.00 rows=6 width=13) (actual time=0.002..0.003 rows=0 loops=3)
               Reader: Single File
               Row groups: 1
 Planning Time: 0.851 ms
 Execution Time: 31.638 ms
(13 rows)

Time: 33.196 ms
So far I have found that it breaks in parquet_impl.cpp at read_next_rowgroup. More precisely, here:
    if (this->reader_id != (coord->next_reader - 1))
        return false;
Both reader_id and next_reader are 0, so the comparison is 0 != -1 and the function returns false without reading a row group.
I might be wrong, but I think the above needs to become:
    if (this->row_group != (coord->next_rowgroup - 1))
        return false;
Rationale: the single-file FDW plan has only one reader per worker, so parallelism should be achieved by processing row groups in parallel.
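To make the rationale concrete, here is a minimal sketch of row-group-level work sharing. It is not the parquet_fdw code: `Coordinator`, `claim_next_rowgroup` and `assign_rowgroups` are hypothetical names, and the workers are simulated by taking turns in one process. In PostgreSQL the parallel workers are separate processes, so the counter would have to live in shared memory. The point is only that each row group is handed to exactly one worker.

```cpp
#include <atomic>
#include <vector>

// Hypothetical stand-in for the shared coordinator state (assumption, not
// the actual parquet_fdw struct).
struct Coordinator {
    std::atomic<int> next_rowgroup{0};
};

// Atomically claim the next unread row group; returns -1 when none are left.
int claim_next_rowgroup(Coordinator &coord, int total_rowgroups) {
    int rg = coord.next_rowgroup.fetch_add(1);
    return rg < total_rowgroups ? rg : -1;
}

// Simulate `workers` participants taking turns on a single file and return,
// for each row group, the index of the worker that claimed it.
std::vector<int> assign_rowgroups(int workers, int total_rowgroups) {
    Coordinator coord;
    std::vector<int> owner(total_rowgroups, -1);
    bool progress = true;
    while (progress) {
        progress = false;
        for (int w = 0; w < workers; ++w) {
            int rg = claim_next_rowgroup(coord, total_rowgroups);
            if (rg != -1) {
                owner[rg] = w;   // this worker scans row group rg
                progress = true;
            }
        }
    }
    return owner;
}
```

With one row group and three participants (as in the plan above, loops=3), `assign_rowgroups(3, 1)` gives the single row group to one participant and the other two simply find nothing left to read, rather than every participant returning false.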
Hi @sdressler,
Thank you for reporting the issue. I pushed a bugfix to master. Can you please check whether it works for you?
Yep, looks good to me. Thanks!
Description
After running ANALYZE on the FDW-backed tables and enabling parallel querying, the query stops executing properly and always returns zero rows. Disabling parallelism makes the query work again.
Without parallelism enabled:
With parallelism enabled:
Reproduction
Observed Result
Expected Result