rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

Impossible to stop/kill slow query #5197

Open let4be opened 8 years ago

let4be commented 8 years ago

Let's say I do

r.db("local").table("domains").filter({domain: "odesk.com"}).limit(1)

Though the table has a primary key on the domain field, it runs insanely slowly. I can't even kill this query; it effectively locks up the whole DB. I tried running the query from the RethinkDB data explorer (port 8080) and from recli (https://github.com/stiang/recli).

RethinkDB 2.2.1; the data set contains ~140 million records in the domains table.

danielmewes commented 8 years ago

This is a limitation of how queries are currently terminated: a query only gets terminated when it generates the next (batch of) results. In this case that appears to take a very long time, since presumably the filter criteria match very rarely?
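The mechanism described above can be illustrated with a minimal sketch (this is not RethinkDB's actual implementation, just a generic model of cooperative cancellation checked only at batch boundaries): a scan that fills a batch before checking its cancellation flag cannot be interrupted while the batch is being produced, so a rarely-matching filter over 140 million rows blocks the kill for as long as the first batch takes.

```javascript
// Sketch: cancellation is only observed between batches, never mid-scan.
// (Illustrative only -- scanInBatches, isCancelled, etc. are made-up names.)
function* scanInBatches(rows, batchSize, isCancelled) {
  let batch = [];
  for (const row of rows) {
    // No cancellation check inside this loop: a rarely-matching filter
    // may scan millions of rows before the batch fills up.
    if (row.match) batch.push(row);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
      // The cancellation flag is only consulted here, at the batch boundary.
      if (isCancelled()) return;
    }
  }
  if (batch.length > 0) yield batch;
}

// "Kill" the query immediately -- the first batch is still produced in full,
// because the cancel is only seen after it has been emitted.
let cancelled = false;
const rows = Array.from({ length: 10 }, (_, i) => ({ id: i, match: true }));
const it = scanInBatches(rows, 5, () => cancelled);
cancelled = true;
const batches = [...it];
console.log(batches.length); // 1 -- only one batch escapes before the cancel is seen
```

With a filter that matches often, the batch boundary comes quickly and the kill feels instant; with a filter that almost never matches, the boundary may be hours away.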

We should explain this limitation better in the documentation. I'm going to open a separate issue for that.

Would you mind elaborating a bit on what you mean by "it effectively locks up the whole db"? Do other queries become (too) slow while this is running?

let4be commented 8 years ago

The whole database becomes unresponsive and other queries execute a lot slower. I also see ~50-100k reads/second in the performance chart (I guess RethinkDB performs a full scan). Because of the data size (140 million records), the first query literally takes hours. Is there any way to forcefully stop such queries?

danielmewes commented 8 years ago

> The whole database becomes unresponsive

That probably shouldn't happen. Can you check free -m to make sure that there's enough memory available and the server isn't going into swap?

I suspect it could become so slow because of i/o contention. Is the database stored on a rotational disk (rather than an SSD)?

Currently the only way of killing this query is restarting RethinkDB on the server.

danielmewes commented 8 years ago

Oh, also for the future: ReQL doesn't use indexes automatically. You have to explicitly specify the index to use, or, in the case of a point lookup on the primary index, use the get command rather than filter:

r.db("local").table("domains").get("odesk.com")

See http://rethinkdb.com/docs/secondary-indexes/javascript/ for more details.
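For a lookup on a field that is not the primary key, the equivalent indexed query uses a secondary index with getAll and an explicit {index: ...} argument. A sketch (assuming a secondary index on the domain field does not already exist; run against a live server, these return promises/cursors in the JavaScript driver):

```js
// Create the secondary index once, wait for it to build, then query it.
// getAll with an explicit index avoids the full table scan that filter does.
r.db("local").table("domains").indexCreate("domain")
r.db("local").table("domains").indexWait("domain")
r.db("local").table("domains").getAll("odesk.com", {index: "domain"}).limit(1)
```

Since domain is the primary key here, get is the simpler choice; the getAll form matters when the lookup field is not the primary key.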

let4be commented 8 years ago

The database is on an SSD. Regarding memory (while the slow query was running):

             total   used   free  shared  buffers  cached
Mem:         16048  15881    167       5       26    6590
-/+ buffers/cache:   9264   6784
Swap:            0      0      0

Thanks @danielmewes

danielmewes commented 8 years ago

Thanks for checking, that memory consumption looks fine.

I'm not quite sure why the filter query would make other queries that much slower on an SSD (a bit of slowdown is normal of course). What are the other queries you're running like? Can you also check how much CPU the server is using (via top or htop), and how much i/o is happening (e.g. iostat -d 2 -m)?

By the way, I opened https://github.com/rethinkdb/docs/issues/979 to better document the limitations of killing a query.

As mentioned, I believe changing your query to use get should avoid this issue in the future (it should return really quickly with that).

dollschasingmen commented 7 years ago

+1, we have observed something very similar. A table-scan-style query looking for a rare event caused the entire DB to "lock up".

Edit: actually, it looks like it only "locks up" the table, not the entire DB.