adjust / parquet_fdw

Parquet foreign data wrapper for PostgreSQL
PostgreSQL License

unrecognized node type when selecting #2

Closed ch-peterspiro closed 1 year ago

ch-peterspiro commented 5 years ago

Hello, I've successfully installed the extension on PG 10.6, and followed the instructions in README, including creating a foreign table with a schema definition that I believe should be consistent with the contents of the parquet file it's defined for.

However, when I try to select from the table, I get: [XX000] ERROR: unrecognized node type: -1342623984

I've tried this with different parquet files (and so different table definitions), but all that changes is the number at the end of the error message.

Would you be able to supply an example parquet file and a "create foreign table" statement that works for that file, along with the version of Postgres for which this works for you?

Thanks!

zilder commented 5 years ago

Hi @ch-peterspiro,

There is an example.parquet file in the data subdirectory of the repo. I had a similar error recently that was related to plan caching; I pushed a bugfix yesterday. Can you try with the latest version?

zilder commented 5 years ago

And the pg version is 10.4. You can find the full example in the regression tests.

ch-peterspiro commented 5 years ago

Great thanks. Turns out to work fine for 10.4 for me, but not for 10.6.

zilder commented 5 years ago

Hi @ch-peterspiro,

I could not reproduce this issue in either 10.6 or 11.1. Do you have other extensions installed? Can you provide an example of your parquet file with its schema and a sample query?

ch-peterspiro commented 5 years ago

Yes, I also have plpgsql installed. The example I'm using is the one from https://github.com/zilder/parquet_fdw/blob/master/input/parquet_fdw.source / https://github.com/zilder/parquet_fdw/blob/master/data/example.parquet.

zilder commented 5 years ago

Is there a chance that you have a non-standard postgres version? Did you install it from packages? If so, which OS distribution do you use?

ch-peterspiro commented 5 years ago

I'm actually using the Postgres docker containers at: https://hub.docker.com/_/postgres/

So I haven't installed Postgres itself, just started the containers.

However, to build the extension, I do need to install a postgresql-server-dev package, so for example to build the extension on the 11.1 container, I docker exec'd into the container, then did something like:

apt update
apt install git
git clone https://github.com/zilder/parquet_fdw.git
cd parquet_fdw/
apt install sudo
source install_arrow.sh
apt -y install postgresql-server-dev-11
apt -y install g++
apt install make
make install

zilder commented 5 years ago

I set up a docker container from the link you provided, ran the same steps you did, and still can't reproduce the issue. I tried with pg 11.1:

                                                             version                                                              
----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.1 (Debian 11.1-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 row)

At this point it is hard to say what is wrong. It looks like some internal postgres code tries to invoke copyObject() or equal() or one of a dozen walker and mutator functions on an object that isn't a postgres Node, and I can't see where that could happen. My only suspect is ForeignPath.fdw_private, which is supposed to be a Node (a List, actually) but in my case is just a plain C object. However, as stated in the comment on ForeignPath, it isn't touched by postgres core, so theoretically it shouldn't be a problem.
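
To illustrate why the number in the error message looks random, here is a minimal sketch of tag-based dispatch. It is a toy model, not PostgreSQL source: the NodeTag values, the PlainState struct, and the describe_node() function are all hypothetical names made up for this example. It assumes only that node-handling functions read a leading tag field and report an error for any value they do not know.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for a node tag enum; values are illustrative only. */
typedef enum { T_Invalid = 0, T_List = 1, T_Const = 2 } NodeTag;

/* A node-like struct: the tag is the first field. */
typedef struct { NodeTag type; } Node;

/* Hypothetical plain C struct with no leading tag (assumption for the demo). */
typedef struct { const char *filename; long row_count; } PlainState;

/*
 * Toy dispatcher: reads the first sizeof(int) bytes of whatever it is given
 * as a tag and switches on it. Returns 0 for a known tag, -1 otherwise,
 * and writes a description or an error text into msg.
 */
static int describe_node(const void *ptr, char *msg, size_t len)
{
    int tag;

    assert(ptr != NULL && msg != NULL);
    memcpy(&tag, ptr, sizeof tag);   /* avoids aliasing problems */

    switch (tag) {
        case T_List:
            snprintf(msg, len, "list node");
            return 0;
        case T_Const:
            snprintf(msg, len, "const node");
            return 0;
        default:
            /* Whatever bytes happened to be there are printed as the "type". */
            snprintf(msg, len, "unrecognized node type: %d", tag);
            return -1;
    }
}
```

In this model, passing the address of a PlainState to describe_node() would interpret the leading bytes of the filename pointer as a tag, so the reported number would depend on memory layout. That would be consistent with the number changing between different files and table definitions, although it does not show where the real call happens.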

zilder commented 5 years ago

If you have some experience with debugging, it would be super helpful if you could attach to a postgres backend with gdb (the backend's pid can be obtained by running select pg_backend_pid(); in psql), set a breakpoint on the elog_start function, run the query, and when the breakpoint is hit, run:

backtrace

in gdb.

(You'll need a postgres package with debug symbols to extract some meaningful information from it)