Closed pkkummermo closed 8 years ago
You have neglected to tell us the actual queries you are running. Please include details of those.
You should also be aware that we are actively working on the query-engine performance and improvements can be expected over the next few weeks.
@pkkummermo If `sales` is a measurement, then that's very high series cardinality. Just some back-of-the-envelope math ( 30 * 10,000 * 160 * 10 ) suggests around 480,000,000 possible series in that measurement. I would suggest lowering the number of possible tags and separating the data over two or more measurements.
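The arithmetic behind that estimate, as a quick sketch (the four factors are the per-tag value counts assumed in this thread; the worst-case series count is simply their product):

```python
# Back-of-the-envelope series cardinality: the maximum number of
# distinct series in a measurement is the product of the number of
# distinct values each tag can take.
tag_value_counts = [30, 10_000, 160, 10]  # counts assumed above

max_series = 1
for count in tag_value_counts:
    max_series *= count

print(max_series)  # 480000000 possible series
```

This is an upper bound; the real series count is however many of those tag combinations actually receive writes.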
Our testing indicates that the best performance for your hardware setup is around 100k series per measurement. Approaching 1M series begins to seriously slow down both queries and writes.
Hi! The queries are simple: `sum(v1) WHERE time {{today}} GROUP BY time(1h)`. I suspected the tags might be the problem, but shouldn't 10k values be a piece of cake to index? I think there's a line in the docs saying tags shouldn't exceed 100k, so I thought 10k was well within the "expected" limits.
Can you show us the exact queries?
Ah, so it's permutations. That would explain the behaviour. Thanks for the tip. I can provide the exact queries tomorrow if they are of interest.
@pkkummermo please do share the queries, and if you can, the log statements showing the execution time. For example:
[query] 2015/11/04 13:07:55 SELECT mean(value) FROM "telegraf"."default".cpu_usage_guest WHERE time > now() - 1d GROUP BY time(1h)
[http] 2015/11/04 13:07:55 ::1 - - [04/Nov/2015:13:07:55 -0800] GET /query?db=telegraf&q=SELECT+MEAN%28value%29+FROM+cpu_usage_guest+WHERE+time+%3E+now%28%29+-+1d+GROUP+BY+time%281h%29 HTTP/1.1 200 197 - InfluxDBShell/0.9.4.1 1e45aab6-8338-11e5-8148-000000000000 1.737655ms
The queries just before the crash are as follows:
[tsm1wal] 2015/11/05 13:02:34 /opt/influxdb_data/incomestats/default/26 flush to index took 815.24874ms
[http] 2015/11/05 13:02:35 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+sum%28%22value%22%29+FROM+%22income_initial_price_10m%22+WHERE+%22client%22+%3D+%271999%27+AND+time+%3E+1446678000s+and+time+%3C+now%28%29 HTTP/1.1 200 122 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 160a3c6b-83b5-11e5-8a6b-000000000000 6.468341299s
[query] 2015/11/05 13:02:35 SELECT initialPrice, monthlyPrice FROM "incomestats"."default".sales WHERE time > now() - 30s
[http] 2015/11/05 13:02:36 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+count%28%22value%22%29+FROM+%22income_initial_price_10m%22+WHERE+time+%3E+1446678000s+and+time+%3C+now%28%29 HTTP/1.1 200 122 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 161eeb7b-83b5-11e5-8a6d-000000000000 7.303513539s
[http] 2015/11/05 13:02:36 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+sum%28%22value%22%29+AS+%22value%22+FROM+%22income_initial_price_10m%22+WHERE+time+%3E+1446678000s+and+time+%3C+now%28%29+GROUP+BY+time%281d%29 HTTP/1.1 200 130 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 1609a814-83b5-11e5-8a67-000000000000 7.537070515s
[http] 2015/11/05 13:02:36 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+last%28%22initialPrice%22%29+FROM+%22sales%22+WHERE+time+%3E+1446678000s+and+time+%3C+now%28%29 HTTP/1.1 200 110 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 160a33e6-83b5-11e5-8a69-000000000000 7.571324992s
[http] 2015/11/05 13:02:36 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+sum%28%22value%22%29+AS+%22Startpris%22+FROM+%22income_initial_price_10m%22+WHERE+time+%3E+1446678000s+and+time+%3C+now%28%29+GROUP+BY+time%281h%29 HTTP/1.1 200 177 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 1609a817-83b5-11e5-8a68-000000000000 7.660962265s
[http] 2015/11/05 13:02:36 10.248.149.169 - root [05/Nov/2015:13:02:28 +0100] GET /query?db=incomestats&epoch=ms&q=SELECT+count%28%22initialPrice%22%29+AS+%22initialPrice%22+FROM+%22sales%22+WHERE+time+%3E+1446678000s+and+time+%3C+now%28%29 HTTP/1.1 200 114 http://heimdall:3000/dashboard/db/incomedash Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36 163396d5-83b5-11e5-8a6e-000000000000 7.50127786s
@pkkummermo since you are on `tsm1` and have ~300MB of data, that translates to roughly 100,000,000 field values, assuming you aren't storing any strings. A `SELECT COUNT()` over 100 million points is going to take a while.
That said, it seems odd that the CPUs would be pegged, as a COUNT is almost exclusively I/O-bound. You mentioned "using Grafana to visualize by asking for data within the current day every 10s." I strongly recommend setting up Continuous Queries to downsample the data into a new measurement and then graphing that in Grafana. That way you are pulling 100-1000 points from disk for each dashboard query, instead of 100k points.
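A downsampling CQ along these lines would do it (a sketch only: the database, measurement, and field names follow the queries earlier in this thread, but the CQ name and target measurement are illustrative, and which tags to carry into the `GROUP BY` is up to you):

```sql
-- Pre-aggregate hourly sums server-side, so the dashboard reads
-- ~24 points per day instead of scanning the raw points each refresh.
-- "income_1h" and "income_initial_price_1h" are illustrative names.
CREATE CONTINUOUS QUERY income_1h ON incomestats BEGIN
  SELECT sum("value") AS "value"
  INTO "income_initial_price_1h"
  FROM "income_initial_price_10m"
  GROUP BY time(1h)
END
```

Then point the Grafana panels at `income_initial_price_1h` instead of the raw measurement.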
That sounds a bit extreme, as we have v1,v2,v3,v4 per database entry with perhaps 120k entries total. The `select count(*)` is also limited by time, which would mean it didn't query all the data. I tried creating both a 10m continuous query and a 1h one, but the performance improvement was minuscule.
My best guess is that the indexing of the tag set is my problem, especially if it's reindexing per insert and tag indexes are permutations of the available tags. The InfluxDB docs should have a section describing "best practices" for database design so other people don't run into the same problem I did :)
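For reference, a quick way to gauge whether the tag set has exploded (InfluxQL that exists as of 0.9; the measurement and tag names below come from this thread's queries and are otherwise assumptions):

```sql
-- List the distinct series under the measurement; a very long
-- result here is the cardinality symptom discussed above.
SHOW SERIES FROM "sales"

-- Inspect the distinct values of a single tag key
-- ("client" is one of the tags seen in the queries above).
SHOW TAG VALUES FROM "sales" WITH KEY = "client"
```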
@pkkummermo totally agreed on better docs for schema design and performance. I did miss the time restrictions in your queries (I'm having a hard time reading queries today for some reason).
Hi!
I've installed the nightly (now running version 0.9.5-nightly-6682752) and I'm having huge performance issues which ultimately lead to a crash. I've upgraded the nightly three times (the latest being version 0.9.5-nightly-6682752), just to see if there was a known bug which had been fixed.
I have the following schema for `sales`:
We're using version 2.0 of java-influxdb to report to the server, with maybe 1-2 inserts every 4s, and using Grafana to visualize by asking for data within the current day every 10s.
At first everything went fine, but after a while (24h) I found that queries which took 18ms now took 200ms. Even later (the next day) the queries took 5s(!). All of our data is approx 300MB in size. The server, a local blade with 64GB RAM and 48 cores, was running at ~100% on every core. Storage consists of 6 SSDs in RAID.
Is there something obvious I'm doing wrong? Is the schema wrong? Is there a configuration option I've totally missed?
PKK
Configuration file: