DevotedMC / NameLayer

A Citadel reinforcement group manager. Allows the creation of groups in Minecraft. Built for Spigot 1.14.4
https://build.devotedmc.com/job/Namelayer-master/
BSD 3-Clause "New" or "Revised" License
3 stars 8 forks source link

Start-up hangs after starting connection pool #25

Closed Murreey closed 7 years ago

Murreey commented 7 years ago

I updated my server to the latest Spigot 1.11.2 build (git-Spigot-3fb9445-2b6c9f4) and NameLayer to 2.10.02, and now it hangs when starting the server.

It gets to this point before stopping, and nothing else loads after this. There's no error or anything.

08.05 07:50:02 [Server] INFO [NameLayer] Enabling NameLayer v2.10.0
08.05 07:50:02 [Server] INFO [NameLayer] Preparing to generate ConnectionPool for NameLayer.
08.05 07:50:02 [Server] INFO [NameLayer] Connecting to {db}@{ip}:3,306 using {username}
08.05 07:50:02 [Server] WARN [Server thread] INFO com.zaxxer.hikari.HikariDataSource - HikariPool-1 - Started.

I've tried leaving it for 20 minutes or so before I just have to force kill the server. I've tried downgrading back to the last NameLayer build we were using (2.10.0) and it now does the same thing, so I guess it's something to do with the updated Spigot build. The old Spigot we were using was from around March. If I remove NameLayer completely, it starts up fine. The MySQL DB is definitely working, and other plugins are able to access it fine.

My config:

sql:
  hostname: {ip}
  port: 3306
  dbname: {db}
  username: {username}
  password: {password}
  connection_timeout: 10000
  idle_timeout: 600000
  max_lifetime: 7200000
  poolsize: 10
groups:
  enable: false
  grouplimit: 1
  creationOnFirstJoin: true
persistance:
  forceloadnamecaching: false

Thanks.

ProgrammerDan commented 7 years ago

Can you try with a poolsize of 1 or 2, and let me know if the issue persists? Depending on your database configuration, it's possible the pool is attempting to allocate more connections then your database will permit, and is "hanging" while waiting for a connection to become available to add to the pool.

I've never experienced this myself but it's possible.

I think the default max # of connections is quite small, all considered, in mysql/mariadb. I generally increase it to 1k for my server.

Hope that helps, no promises though, this is weird.

Murreey commented 7 years ago

Getting the same result with poolsize 1 and 2.

ProgrammerDan commented 7 years ago

Just to confirm; you have latest CivModCore loaded as well?

Then, does syslog reveal any errors during this time period? E.g. at the moment the hang begins, do you see new syslog or mysql.log entries?

Murreey commented 7 years ago

Running CivModCore 1.5.11, latest from Jenkins.

Using a server host and phpMyAdmin so I can't see system logs, although checking the status logs in phpMyAdmin I found this:

There's a bunch of those all the same which I assume are from every time I started up. Seems that's what it's hanging on?

ProgrammerDan commented 7 years ago

Interesting. You have all your plugins mapped to the same virtual database?

Can you pastebin me a typical startup log (please redact if anything sensitive is printed) showing what is starting before NameLayer?

Murreey commented 7 years ago

Yeah, everything uses the same MySQL database. Here's the log ending when NameLayer stalls

It's normally the exact same as that, and there's a few plugins in there using the DB before NameLayer starts (SaneEconomy).

ProgrammerDan commented 7 years ago

Ok. So, best I can offer atm is this: http://stackoverflow.com/questions/13148630/how-do-i-find-which-transaction-is-causing-a-waiting-for-table-metadata-lock-s

Some plugin, running prior to NameLayer, seems to be grabbing an exclusive lock on that database's metadata table (which namelayer needs context to...). At the point of failure, NameLayer is being prevented from generating a table at all, which points to the problem being upstream.

Now, one potential workaround is to simply point Namelayer to a different virtual database. Using your administrative console, create a new database for Namelayer, migrate your old tables there (including db_version) -- I can get you a list if needed -- then point the config at that new database in your same database server.

I think that should bypass the lock, assuming my diagnosis-at-a-distance is correct.

Alternatively, check out that stackoverflow link and others related to "mysql hangs with 'waiting for table metadata lock'".

Let me know if any of this provides insight.

Murreey commented 7 years ago

Unfortunately I can't try the second virtual database, the way the host is set up is just with one MySQL server and a different DB for each server instance. Most of the metadata tables and the information_schema database are locked too.

However it seems the host restarted the MySQL server yesterday which they don't often do, and given that the issue is happening even with older NameLayer builds I figure something was changed with the settings that's causing the locking issue. Even when I start the server with just CivModCore and NameLayer it still happens so I don't think it's another of our plugins.

I think I might have to contact the host and see if they can either figure out what's locking the metadata table or if they can give us a second virtual database to run NameLayer from.

ProgrammerDan commented 7 years ago

Sounds good. Let me know if they advise any changes to the queries run; also, what version of mysql is your host running?

Murreey commented 7 years ago

They're running MySQL 5.5.45.

Murreey commented 7 years ago

Host figured out what was locking the metadata table, and NameLayer is working perfectly now.

Thanks for your help!

ProgrammerDan commented 7 years ago

That's great news! I'm glad you were able to move to a solution.