claudehohl / Stikked

An advanced and beautiful pastebin written in PHP
992 stars 220 forks source link

Use mysql match() against rather than search %string% #535

Closed PJanisio closed 4 months ago

PJanisio commented 4 years ago

Hi again guys,

Looking at my mysql-slow-queries log I found resource killing sql syntaxes that we have in stikked. If you are running pastebin with a lot of pastes stored in database sql queries can take several seconds as in example from my log:

SELECT id FROM pastes WHERE private = 0 AND (title LIKE '%:8080%' OR raw LIKE '%:8080%');
# Time: 200120 17:57:55
# User@Host: blablabla[blablabla] @ localhost []
# Thread_id: 27473  Schema: database_name  QC_hit: No
# Query_time: **10.599777**  Lock_time: 0.000115  Rows_sent: 30  Rows_examined: 34391
# Rows_affected: 0  Bytes_sent: 15051396

There is an function to do match againts and it is most common used (didnt tested yet). Anyway, worth to test, when I get some free time I will submit results (or maybe PR) if got fired from work :dancer:

The-Compiler commented 4 years ago

I'd definitely appreciate some kind of proper full-text search, I have a pastebin where searches take some 15-20 seconds currently.

Note that that'd probably mean that sqlite support would need to be dropped (or at least would get more complex) though.

PJanisio commented 4 years ago

BTW i changed SQL engine from myisam to innodb for pastes table. That is helping also a lot.

divinity76 commented 4 years ago

is match() supposed to be significantly faster than LIKE ? i kinda doubt it, IMO a competent query optimizer should be able to make them roughly equally performant

btw for SQLite users, the FTS5 extension should give a huge performance increase

.. also Google comes to mind

PJanisio commented 3 years ago

Im not 100% sure it will help, but seems like it could be a good step forward. My pastebin has actually around 50-100+ active users per second and Im optimizing / deleting some features to not get out of memory on my current VPS with 8GB of RAM where stikked plays the biggest part of the memory usage cake :)

BTW: do not try query cache :) it will kill your disk space / generate high load if you have huge traffic.

divinity76 commented 3 years ago

anyone done any LIKE vs match() benchmarks yet? another thing i'd love to see is how SQLite's LIKE compares to MySQL's LIKE, with both PRAGMA mmap_size=268435456; and PRAGMA mmap_size=0;

anyone here having LIKE performance issues with a database you can share? i want to do some benchmarks myself on a real database (as long as the db dump is <3TB at least; the biggest SSD i have to play with atm is ~3.5TB free)

@PJanisio about query cache, kill it with fire. disable it. it's so bad that MySQL completely removed support for query cache in version 8.0, and from personal experience, optimizing an application to play nice with query cache isn't worth the effort.

PJanisio commented 4 months ago

No further progress on this topic. Lets close it.