cydrobolt / polr

:aerial_tramway: A modern, powerful, and robust URL shortener
https://polrproject.org
GNU General Public License v2.0
4.98k stars 888 forks source link

Polr hangs when there are hundreds of thousands of links in DB, and consumes all server RAM #563

Open urbaez22 opened 4 years ago

urbaez22 commented 4 years ago

Polr is having a weird behavior in our production environment, it's throwing timeout when shorten action is called from either API or web interface. The server in which is running got out of memory and there are tens of thousands of queries to DB (according to NewRelic), with each request lasting several minutes (after throwing timeout).

Expected Behavior

Just shorten the link regardless of the amount of links already shortened in DB.

Current Behavior

The server in which is running got out of memory and there are tens of thousands of queries to DB (according to NewRelic), with each request lasting several minutes (after throwing timeout).

Possible Solution

After logging been enabled, we could find some queries to DB lasting too much (more than one second), so we created an index in table links for is_custom: CREATE INDEX links_is_custom_index ON links(is_custom); And changed a PHP instruction to execute a raw-query instead of a Eloquent-generated one (as before). New raw query: SELECT * FROM links WHERE id = (SELECT max(id) FROM links WHERE is_custom = ?); And query started to last less than 10ms per call. Nevertheless, the problem arose after some days, and we found there's a mechanism in code used to ensure not to pick an already created ending to short-link (a while loop). There we found MySQL was returning results when looking for a just-generated ending because it was case-insensitive in WHERE clause. So we altered the table to store the short_url as binary, and that resolved the problem.

Steps to Reproduce (for bugs)

  1. You have to have hundreds of thousands of links in DB, so you can see the MySQL issue when ensuring the just-generated short ending has not been used yet.

Context

The issue affected operation in production because Polr is sharing the same server with a Tomcat which hosts several critical Java web applications.

Your Environment

cydrobolt commented 3 years ago

Thanks for reporting this, I'm not sure why the case-insensitivity default in MySQL is causing your server to hang, but you are right that it definitely should default to case-sensitive search. We should perform a BINARY comparison so that the WHERE clause is case sensitive. Would you be willing to elaborate on why the case insensitive issue was causing the loop to hang?