duckdb / pg_duckdb

DuckDB-powered Postgres for high performance apps & analytics.
MIT License
1.29k stars 48 forks source link

Harden against Postgres interactions with C++ destructors #93

Open JelteF opened 1 month ago

JelteF commented 1 month ago

While looking at the the C++ guidelines in the CONTRIBUTING.md file proposed in #90, I realized/remembered that Postgres C and regular C++ are not automatically the good friends that the project currently assumes they are.

Specifically if Postgres throws an ERROR, it will use sigsetjmp and longjump to traverse the the stack until the next PG_TRY. This is really problematic when considering the way C++ destructors work. Because the longjump will jump over these destructors. This can cause many different problems like memory leaks or locks acquired by lock_guard not being released.

The other way around is similarly bad. Any C++ exceptions that are not caught will skip over any cleanup that Postgres does in PG_CATCH. And thus probably killing the whole Postgres process tree.

This is all listed in the postgres docs as well: https://www.postgresql.org/docs/current/xfunc-c.html#EXTEND-CPP

The only way to fix these problems in a managable way I think, is by making the boundaries of calling C++ from C and C from C++ much clearer than they are now (possibly by never including both duckdb headers and postgres headers in the same files). And then at these boundaries convert C++ exceptions to Postgres errors and the other way around.

P.S. This same problem applies to Rust and was worked around by pgrx by implementing panic -> pgerror and pgerror -> panic conversion at every FFI boundary: https://github.com/pgcentralfoundation/pgrx/blob/develop/docs/src/design-decisions.md#protecting-the-rust-stack-read-lying-to-postgres

wuputah commented 2 days ago

-bug as this does not have a specific bug report to fix, but instead is an approach to improve overall error handling. That said, we have agreed that this is a good idea and we want to move towards having all interactions between C and C++ be in a single file similar to how you would use FFI -- but we won't try to catch memory allocation errors.