Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License

Execute API - Fatal Python error: deallocating None #115

Closed tchornyi closed 8 years ago

tchornyi commented 9 years ago

We are using a recent Multicorn checked out from GitHub and installed with PostgreSQL 9.4 on Ubuntu.

We have a custom Foreign Data Wrapper implementation based on the https://github.com/Mikulas/pg-es-fdw ElasticSearch FDW. We populate a PostgreSQL database with data, and the FDW is responsible for synchronizing it with ElasticSearch.

The pgScript that populates the database runs in a loop and inserts data into several tables. After those inserts it runs queries of the form UPDATE foreign_table SET foreign_table.col = VALUE_A WHERE ID = SOMEVALUE on the foreign table. These queries call the execute() method of our FDW implementation. In this loop, always after around 1180 repetitions, we hit Fatal Python error: deallocating None.

While investigating, we found that the problem is around the execute() call: according to our logs, execution never returns from that method when the error occurs. We also monitored the reference count of None and noticed that it first goes up, then drops suddenly, then climbs again for a long while, and then drops again. Execution starts with the counter at around 450 and terminates at around 120.
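For readers who want to watch the same counter, here is a minimal sketch of one way to log it from Python. The thread does not say how the reporter measured it, so the helper names and the stand-in row below are assumptions for illustration only. Note that on CPython 3.12 and later None is immortal, so the reported number no longer moves the way it did on the interpreter used here.

```python
import sys


def none_refcount():
    """Return CPython's current reference count for the None singleton."""
    return sys.getrefcount(None)


def log_none_refcount(label, log=print):
    """Log the count with a label and return it (hypothetical helper)."""
    count = none_refcount()
    log(f"{label}: refcount(None) = {count}")
    return count


# Hypothetical usage: bracket the code under suspicion with two readings.
before = log_none_refcount("before execute()")
rows = [{"id": 1, "col": None}]  # stand-in for rows an FDW might return
after = log_none_refcount("after execute()")
```

A count that trends downward across many iterations, rather than returning to its baseline, points to a missing increment somewhere in C code.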

rdunklau commented 9 years ago

I think I found the culprit. Could you please test with this commit? https://github.com/Kozea/Multicorn/commit/c3ff2143e11a38b19e09bb8c304635d74645d4d3

Do your rows have any null values? If not, I may have fixed an unrelated bug.

tchornyi commented 9 years ago

Thank you, will do.

It seems so, yes: the row contains a null value, and that value is returned as None from the execute() method. Interestingly, every call to execute() returns a row with one null value (which is fine by our program's logic), yet it only crashes after ~1180 repetitions.

rdunklau commented 9 years ago

Returning None values from execute is OK; they are converted to NULL. When this NULL value was converted back to None (for the update call), the reference count of Py_None was not incremented, but it was decremented once we were done with the value. Hence the reference count of Py_None slowly decreased over time (eating into the many references held to None by various parts of the interpreter and libraries) until None was eventually deallocated.
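The imbalance described above can be sketched with a toy model in pure Python. This is not Multicorn's C code: the RefCounted class, the two conversion functions, and the starting count of 450 are all hypothetical, chosen only to show how one unmatched decrement per call eventually drives a shared object's count to zero.

```python
class RefCounted:
    """Toy stand-in for a reference-counted singleton such as Py_None."""

    def __init__(self, refcount):
        self.refcount = refcount
        self.deallocated = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            # The real interpreter aborts here: "deallocating None".
            self.deallocated = True


def null_to_none_buggy(none):
    # Bug: hands out the singleton without taking a new reference.
    return none


def null_to_none_fixed(none):
    # Fix: take a reference for the caller, who will release it later.
    none.incref()
    return none


def run_updates(convert, none, iterations):
    """Return the iteration at which the object dies, or None if it survives."""
    for i in range(1, iterations + 1):
        value = convert(none)
        value.decref()  # the caller is done with the converted value
        if none.deallocated:
            return i
    return None


buggy_none = RefCounted(refcount=450)  # illustrative starting count
crashed_at = run_updates(null_to_none_buggy, buggy_none, 2000)

fixed_none = RefCounted(refcount=450)
survived = run_updates(null_to_none_fixed, fixed_none, 2000)
```

In this model the buggy path dies after exactly as many calls as the starting count, while the fixed path leaves the count unchanged. In a real interpreter other code keeps taking and releasing references to None, which is consistent with the count fluctuating and the crash arriving only after about 1180 repetitions.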

tchornyi commented 9 years ago

Good news: your fix helped and the problem is gone. Thank you!