scripting / Scripting-News

I'm starting to use GitHub for work on my blog. Why not? It's got good communication and collaboration tools. Why not hook it up to a blog?
121 stars 10 forks source link

ER_CON_COUNT_ERROR #246

Open scripting opened 1 year ago

scripting commented 1 year ago

I've got a problem with an app that uses the MySql package in Node. I'm getting a ER_CON_COUNT_ERROR pretty close to startup. I'm taking the message at face value but it's hard for me to believe it's true that my app is making too many connections because I just doubled the max connections from 100 to 200 and still got the error on the very first query it tried to make. The problem goes away when I reboot the machine, but only for a little while, eventually it starts doing this again. This is in an app that has been running without these kinds of errors for months. I haven't tried running the app yet on another machine, but I'm beginning to think this is a not-resolvable issue. I thought I'd pass it by the SQL gurus that read my blog for ideas on how to debug this.

mterenzio commented 1 year ago

I'm not sure what you mean by "not-resolvable." I would think e everything is resolvable. I'm much more of a Posthres guy, but but I would also take that error at face value. I'd look at the connection itself and ask, "Am I using pooled connections or spawning a new connection every time you need to make a query, in response to some external factor?" The fact that you added more connections is odd, but wouldn't be a good fix for certain causes (e.g. a service that was affected by external factors and was recently discovered by an abusive actor) Or if you hit a memory limit causing a previously working query to slow to crawl and not close it's connection and the connections are just piling up. The fact that it goes away on reboot kinda means they are piling up somehow I know this isn't much but logical thinking, but without more context it's hard to know specific reasons causing it.

scripting commented 1 year ago

@mterenzio, thanks for chiming in. :smile:

I suppose in some theoretical way every problem is resolvable. But is it worth the time? Or is it a practical use of time. I'm still a relative newbie to using SQL databases. I wasn't supposed to be working on this now, I have other irons in the fire.

So I might try running the same code on a different Mac. Or I might try deleting and re-creating the database this app is using to see if either of those things make the problem go away.

If this were 20 years ago and this was the Mac heap instead of the SQL connection pool, I'd look to see what was in the blocks that were involved in the memory leak. Analogously I'd love to know what the queries are in the pool that are not finishing in time. I have no idea how to find those.

I'm also using a hosted database on Digital Ocean for this application. They've been running perfectly, but it has occurred to me the problem is on their side of things, and not here.

Again this is code that has been running without problems with a much larger data set for months.

scripting commented 1 year ago

OK, more data.

I think I have proof that the problem is not on my system, rather it's on the database side.

  1. I opened another Mac on my LAN through Screen Sharing, and launched Terminal.
  2. I tried to open the database using the mysql command, and got the ERROR 1040 (08004): Too many connections message.
  3. None of my software has run on this machine.
scripting commented 1 year ago

Next up, I'm going to try deleting the database and starting over.

But hold on, how am I even going to do that if I can't use mysql at the command line?

Something really bad is going on here...

scripting commented 1 year ago

@danderson3 -- please stop posting links to docs.

scripting commented 1 year ago

I scrapped the database and recreated it. And for a while things were running well. Now I'm getting consistent errors, even remaining after I quit the program.

Anyway I recorded the story of what's going on here, verbally.

[xxx](http://scripting.com/2023/01/09/ER_CON_COUNT_ERROR.m4a)

Here's a screen shot that goes with it.

image
scripting commented 1 year ago

One theory I hadn't thought of.

The problems are only on a Mac running on my network at home.

The servers running at Digital Ocean work fine.

I imagine the queries run faster at DO.

Esp since the SQL server is also managed by DO. I guess they would optimize for that.

So it's possible things back up on the Mac where they fly through on the cloud system.

ideoplex commented 1 year ago

It’s been a while since I did this, but did you try observing the connections from MySql? Looks like “show processlist” should do it.

scripting commented 1 year ago

@ideoplex -- I've been waiting for something like that.

I wish they had dates on these items. Some of them are from apps I haven't run on my desktop for YEARS.

Is there a way to see the queries that started these processes?

And is there a way to kill all of them?

And that answers the key question in my voice message. Yes, the queries do stick around long after the app was quit and they last through system restarts.

processlist.txt

scripting commented 1 year ago

I've managed to whittle it down to 11 processes remaining.

scripting commented 1 year ago

More data here.

https://stackoverflow.com/questions/2407732/mysql-proccesslist-filled-with-sleep-entries-leading-to-too-many-connections

I checked on the server in the cloud and its processlist has 30 rows.

jystervinou commented 1 year ago

Sorry just a quick note.

In the past I used MyTop which was great to see which queries were running on the MySQL side.

But it looks abandoned.

Maybe someone knows some equivalent tool, easy to install and run on macOS?

https://www.digitalocean.com/community/tutorials/how-to-use-mytop-to-monitor-mysql-performance

It was written by Jeremy Zawodny.

http://jeremy.zawodny.com/mysql/mytop/

jystervinou commented 1 year ago

Just listened the recording.

i would look at the node code that makes these queries and verify that the code that releases/closes connections or releases connections from the pool is reached, the async execution of Node can be tricky there. (Checking the node MySQL package doc for how to properly release connections when the query is over)

maybe the initial run runs a bunch of queries at startup and you reach 100/250 connections quickly ( at Digital Ocean that might have another higher limit, and faster servers that execute the queries faster), then some of them terminate and leave place for the next query runs in the next minutes when you say it then runs normally

I'd say there must be a mix of lots of normal (and maybe slow) queries, at startup, and some queries not closed at all in the node code. (The ones with Sleep status)

jystervinou commented 1 year ago

As for why it used to work, maybe the number of feeds increased and so the number of queries at startup.

jystervinou commented 1 year ago

We can't see, in the process list log, which queries were running before those connections went sleeping (the running sql is in the Info column when running)

that's when MyTop is useful : you can see, live, updated, show process info, and try to spot which queries finish and keep their connections open.

Note: the sleeping connections might be the ones kept open by the pool..

jystervinou commented 1 year ago

Another great debugging help: the MySQL query log. It needs to be activated, then you have in this file all the queries that go threw MySQL

https://blacksaildivision.com/mysql-query-log

you could count how many are actually being sent to MySQL when you launch the app.

(that's logging on the MySQL side, not on the node/client side) so you get the exact queries sent to MySQL)

One must not forget to turn it off after debugging.

jystervinou commented 1 year ago

How are you updating max_connections?

scripting commented 1 year ago

@jystervinou -- thanks for all the questions and ideas.

I think I now understand what's going on and what I need to do to get beyond this bottleneck.

I use the product on my Mac because that's how I can use a debugger to set breakpoints and be sure that it's working the way I designed it to work. So I turned off the features in the local version that are making the connections go up. I'm also going to cache some of the frequent operations so we don't hit the database at all (for example to validate a user's identity, there aren't that many users relative to the number of feeds or items. But the identity info is checked on every request.)

What's weird is that I just have one entirely UI-based feature left to implement in the project I'm working on, so I'm not really depending on the database for this project, but it made itself an issue.

JY it would be very helpful if you could get in the loop on FeedLand. You helped me get up the learning curve on SQL in the first place, you should see how I'm using it now, I think you'd be proud. ;-)

scripting commented 1 year ago

To close the loop, I had an incredibly productive day, with the problems of the previous two days behind me.

Thanks for all the help esp @ideoplex and @jystervinou.