Closed osawyerr closed 1 year ago
Can you share the _PG_init() function? Or can you link to your repo so we can take a look?
Do we need to pass --test-threads=1 to cargo test in cargo pgx test?
I'd expect this to be an issue, at least (but TBH, I am unsure why I didn't see it in #777 if it is an issue).
Naw. cargo pgx test is supposed to support multiple threads.
My hunch is that this extension is doing something with Postgres when every session starts that should only ever be happening once. Which might mean that using session_preload_libraries just isn't correct. It might also point out some kind of issue with pgx itself -- pgx ought not be crashing anyways.
As @osawyerr pointed out over in Discord, ZDB does this and it doesn't have these problems. It also doesn't do anything fancy during _PG_init(). So I'd like to confirm that's the case here.
A backtrace from the postgres session that core dumped would be useful too. It appears the backtrace above is from the outer "cargo test" controller process. You can convince MacOS to just write them to /cores/.
I got a couple myself that have built up over time!
du -ksh /cores/
147G /cores/
EDIT: Each "cargo test" thread gets its own connection to Postgres, which is where the extension is being loaded/run. So the test framework is concurrent, but it's ultimately through multiple Postgres processes, not threads.
Thanks, that makes sense. I suspected that might have been the case (or else I'd have hit this issue with my thread safety checks), but figured I'd mention it regardless.
@eeeebbbbrrrr _PG_init() is empty.
I think I can consistently reproduce this now. I added the repo here - https://github.com/osawyerr/foo
MacOS on M1 processor, PGX 0.5.6, Postgres 14.5
Incredible.
I can recreate it locally, so that's good (I suppose!). It kinda smells like MacOS isn't reloading the shared library or something. I dunno.
It's also annoying that this is gonna be hard to debug.
Thanks for the report, I'll look into it this week.
I just tested on my AMD Linux machine and this behaves as we'd expect. So it's gotta be MacOS-related. Joy!
I think @thomcc is gonna dig into this.
This is likely related to the cases I described in https://github.com/rust-lang/rust/issues/88737#issuecomment-1178525208, so yeah, I'm probably in a good position to look into it. Worst case we force the dylib to have a distinct name.
Oh, we already handle this in plrust https://github.com/tcdi/plrust/blob/c1be4dfd5517e466ba3e61c01fc56f8072d21ef0/src/generation.rs#L5-L23. Yeah, we should just do the same thing as there. I think there are probably hacks we could do to get around it (involving TLS dtors), but if we use a distinct name there should be no issue.
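A rough illustration of the "distinct name" idea, in plain Rust. This is only a sketch under assumptions: the helper `unique_library_stem`, the counter `BUILD_COUNTER`, and the naming scheme are all hypothetical, and this is not the code from the plrust file linked above. The point is just that if every build produces a library name the loader has never seen before, a stale copy cannot be reused by name.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Process-wide counter so two calls within the same clock tick still differ.
/// (Hypothetical name, for illustration only.)
static BUILD_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: returns a library name stem that is distinct on every
/// call, built from the base name, the current time, and a sequence number.
fn unique_library_stem(base: &str) -> String {
    // Nanoseconds since the Unix epoch; falls back to 0 if the clock is
    // set before the epoch. The counter below still keeps names distinct.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let seq = BUILD_COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{base}_{nanos:x}_{seq}")
}

fn main() {
    let first = unique_library_stem("foo");
    let second = unique_library_stem("foo");
    // Two consecutive "builds" never share a name.
    assert_ne!(first, second);
    println!("{first}\n{second}");
}
```

The counter matters more than the timestamp here: it guarantees uniqueness within one process even if the clock does not advance between calls.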
@thomcc: "I am pretty sure (this issue) is mostly unrelated to the loader. The actual cause is probably a bug in our unwinding code.
Currently, if we panic in a pg_extern function that is called from inside a pg_test function, then we segfault. This seems to be because libtest ends up dropping some Box twice, although it may be unrelated heap corruption that makes it look that way. Either way, if we exit via a segfault, we don't end up cleaning up after ourselves, and we don't detect the partially messed-up state the next time through. If we get lucky and don't crash, then there's no issue where the function fails to be reloaded.
I think libtest dropping a box twice sounds likely to mean there's an edge-case where we end up returning twice from the same rust function (or something like this)."
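For anyone following along, here is a minimal sketch in plain Rust of the invariant being discussed: when a panic unwinds out of a function, each owned Box should be dropped exactly once. It is not pgx's unwinding code and does not reproduce the segfault. The names `DropCounter`, `panicking_callee`, and `run_guarded` are made up for illustration, and `std::panic::catch_unwind` merely stands in for whatever guard sits at the boundary.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Increments a shared counter each time an instance is dropped.
struct DropCounter(Arc<AtomicUsize>);

impl Drop for DropCounter {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in for a function that panics while owning a heap allocation.
fn panicking_callee(counter: Arc<AtomicUsize>) {
    let _guard = Box::new(DropCounter(counter));
    panic!("simulated failure inside the callee");
}

/// Runs the callee behind `catch_unwind` and reports whether it panicked
/// and how many times the boxed value was dropped during unwinding.
fn run_guarded() -> (bool, usize) {
    let drops = Arc::new(AtomicUsize::new(0));
    let for_callee = Arc::clone(&drops);
    let result = panic::catch_unwind(AssertUnwindSafe(move || panicking_callee(for_callee)));
    (result.is_err(), drops.load(Ordering::SeqCst))
}

fn main() {
    let (panicked, drops) = run_guarded();
    // With correct unwinding the box is dropped exactly once.
    println!("panicked = {panicked}, drops = {drops}");
}
```

A counter like this only detects a double drop of a type you control. A double free of an arbitrary Box is undefined behaviour and would more likely show up as the kind of heap corruption or segfault described above.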
This happens on MacOS on an M1 processor with PGX 0.5.6. The _bootstrap.sql file is based on a similar file from the ZomboDB project.
Restarting my machine fixes the issue and the test is successful. There was a similar error that was reported some time back - https://github.com/tcdi/pgx/issues/357. I'm not sure if it's a similar issue here.
To recreate:
cargo pgx new foo
cargo pgx test
Add a _bootstrap.sql file in the sql folder and reference it from the extension's Rust source:
extension_sql_file!("../sql/_bootstrap.sql", bootstrap);
cargo pgx test
Error
Not sure how useful this is but I was able to capture a stack trace in my IDE. Message was "Postgres failed to start"