Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License

Multicorn and Parallel Workers (in 9.6.3) sometimes cause Seg Fault 11 crashes #192

Open rotten opened 6 years ago

rotten commented 6 years ago

Reference this thread in the pgsql-bugs mailing list:

https://www.postgresql.org/message-id/flat/CAMAYy4LaWA8nSWxpgaMp4F-E0uoB2PAstPXidVbQdhq-T7Fagw%40mail.gmail.com#CAMAYy4LaWA8nSWxpgaMp4F-E0uoB2PAstPXidVbQdhq-T7Fagw@mail.gmail.com

It appears that if you have parallel workers enabled and your database contains any Multicorn foreign tables, the database may crash, even if you aren't querying those tables.

It doesn't sound like the PostgreSQL development team is in a hurry to fix this, although they have some interesting ideas in that direction, perhaps for implementation in PG 11. My impression is that there may be something the Multicorn development team can do to help minimize the risk of this happening, although I didn't fully understand what they were suggesting.

Since PG 10 has even more parallel capabilities, I'm worried this will only get worse.

I'm concerned there hasn't been much development on Multicorn for a long time. I'm hoping I don't have to migrate to another FDW framework in order to be able to use parallel sequential scans again. Single-threaded queries are significantly slower, and having to keep parallel workers disabled is crippling my production system.

LuciferSam86 commented 6 years ago

This seems pretty serious. Any news on this bug? Are there any other FDW frameworks to use instead?

rotten commented 6 years ago

The PostgreSQL team figured out a quick hack that significantly reduces the occurrences of this crash and released it in a point release in September. However, it still happens, about once every 2 weeks instead of every day. They said they'd have a permanent fix in PostgreSQL 11, which comes out in October 2018. I don't know if PG 10 is any better, but I'm hoping so, and will bump our systems up as soon as I can to see. I'm not certain the crashes after the point release are exactly the same thing, but on the surface they look like it. I figured I'd drill into it deeper again after we get upgraded to PG 10. Our architecture is heavily dependent on FDWs; we would be really handicapped without them.