EnterpriseDB / mysql_fdw

PostgreSQL foreign data wrapper for MySQL
Other
521 stars 160 forks source link

Mysql server has gone away (mysql_fdw 2.5.5 in pg 13) #215

Closed Magmatrix closed 2 years ago

Magmatrix commented 3 years ago

Hi,

Every now and then i get this on stderr:

ERROR: failed to prepare the MySQL query:
MySQL server has gone away

If i try again, it always works.

This is messing up error handling b/c I have scripts triggering on stderr output. It also makes it hard to use ON_ERROR_STOP.

Is there some option for configuring timeout/keepalive/automatic reconnection? I couldn't find anything in the docs.

surajkharage19 commented 3 years ago

Hi @Magmatrix,

Is there some option for configuring timeout/keepalive/automatic reconnection? I couldn't find anything in the docs.

Yes, you can set the timeout to a higher value in MySQL server. https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_interactive_timeout

Also, we will make a note of the reconnection facility as a feature in mysql_fdw and we will try to implement that in the future.

Magmatrix commented 3 years ago

Hi, thanks for the answer.

I checked the interactive_timeout and it is set to 8 hours. The strange thing is that this query fails semi-predictably at a specific time in the night, and it actually doesn't seem to be related to the amount of time passed since the previous query.

This is what happens: a) The foreign server is queried every 5 minutes during the day, until 9 pm. b) Then at midnight a long running job starts and it performs some other queries around 1am and 1:30am, against the same foreign server but on other tables in the same schema/db. c) Then around 2am it runs the same query as (a), but only once <-- Here is where it fails 3-4 times per week d) When immediately re-running the query, it succeeds in 100% of cases

Checked all router/firewall/switch configurations and found nothing that would affect this.

Any ideas? Can mysql_fdw be configured to silently reconnect (try once) if connection is lost?

surajkharage19 commented 3 years ago

Hi @Magmatrix

b) Then at midnight a long running job starts and it performs some other queries around 1am and 1:30am, against the same foreign server but on other tables in the same schema/db. c) Then around 2am it runs the same query as (a), but only once <-- Here is where it fails 3-4 times per week

If we are executing the queries against the same foreign server but on other tables in the same schema/db then in that case mysql_fdw uses the same connection. To me, it looks like an issue from MySQL side and you might need to further troubleshoot on the same line. Here is a link that can be helpful for troubleshooting: https://dev.mysql.com/doc/refman/8.0/en/gone-away.html mysql_fdw is not setting any timeout while connecting.

Can mysql_fdw be configured to silently reconnect (try once) if connection is lost?

Currently, we don't have any option from mysql_fdw side for reconnection, however, we will consider implementing that in the future.

cstork commented 3 years ago

Hi @surajkharage19,

I have essentially the same issue as @Magmatrix but with a MySQL Server 5.7.25-0ubuntu0.16.04.2 (and also from mysql_fdw 2.5.5 in pg 13).

Sometimes after a long pause without any queries, the first query triggers this error message.

surajkharage19 commented 3 years ago

Hi @cstork,

Sometimes after a long pause without any queries, the first query triggers this error message.

As I mentioned above, this might be happening because of the timeout settings on the MySQL server. Currently, mysql_fdw is unable to do reconnection in case of existing connection is dropped. We will surely try to implement such enhancement in the future.

surajkharage19 commented 2 years ago

Hi @Magmatrix and @cstork,

We have added the automatic reconnection logic and 'reconnect' option at the server level to control (on/off) that functionality under commit 011c499507d3ad5d8985542348625dbb7ee31ac3. Can you please check if that is useful at your end and share your feedback?

surajkharage19 commented 2 years ago

Hi @cstork and @Magmatrix,

We have not received further updates from you on this. Hoping that issue is resolved. We are closing the case for now. If the issue persists, you can reopen the case anytime.