Closed viknet365 closed 3 years ago
Hi @viknet365 ,
Unfortunately, neither consolidateBy nor maxDataPoints is designed to improve query performance. All functions and approximations are applied after fetching all the data, so they only add work instead of reducing it.
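To illustrate the point: with maxDataPoints, every stored point is still read into memory, and only then is the series aggregated down. A simplified sketch of post-fetch consolidation (the function and names here are illustrative, not the actual graphite-web code):

```python
def consolidate(points, max_points, func=lambda xs: sum(xs) / len(xs)):
    """Naive post-fetch consolidation: `points` is already fully in memory."""
    if len(points) <= max_points:
        return points
    step = -(-len(points) // max_points)  # ceiling division: bucket size
    out = []
    for i in range(0, len(points), step):
        bucket = [p for p in points[i:i + step] if p is not None]
        out.append(func(bucket) if bucket else None)
    return out

# A smaller maxDataPoints means MORE aggregation work per bucket,
# and exactly the same amount of data fetched:
raw = list(range(1_000_000))
small = consolidate(raw, 10)    # still touches all 1,000,000 points
large = consolidate(raw, 1000)  # same amount of data read
```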
Hi @deniszh, thank you for your answer. I understand that by design the full file is fetched into memory before any calculation, so in-memory calculation adding some overhead makes sense. But in this case it is a tenfold overhead (1s for 1000 points vs 10s for 10 points), and the time is the same whether I use 'avg' or 'first'/'last' (the latter two should only cherry-pick some data, not compute anything).
Furthermore, if I use whisper-fetch (quite slow: it takes 2.2s) and awk to average the result (200ms to average all the lines on the same computer):
root@nuc:/var/lib/docker/volumes/graphite-storage/_data/whisper/openhab# time awk '{ total += $2; count++ } END { print total/count }' ./consommeHP_Fours.txt
691537
real 0m0,274s
user 0m0,256s
sys 0m0,017s
and we can all agree that neither tool is designed for performance! There may be a good reason for such low performance in my use case (low CPU, low memory in Docker, or a combination of both), but I am quite puzzled...
The behaviour is the same with graphite 1.0.2-3 (4 years old), so it has not changed for many years (and is therefore probably not a bug). Still, I am very surprised by this. After trial and error I have set many of my Grafana maxDataPoints values to 2500. I am removing the BUG tag, but if anyone has time to explain this behaviour, I will be happy.
I guess the consolidation function was not written with performance in mind. See https://github.com/graphite-project/graphite-web/blob/master/webapp/graphite/render/datalib.py#L100-L135 . The list operations could be done faster by just keeping a running sum. Anyway, if you really care about performance, you'd better check out the go-graphite stack.
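For reference, the linked generator buffers each chunk into a list before applying the consolidation function. A running-sum version, as suggested above, avoids the intermediate per-bucket lists; this is a hypothetical rewrite for the 'avg' case, not what graphite-web actually ships:

```python
def consolidate_avg_running(values, values_per_point):
    """Average every `values_per_point` values with a running sum,
    skipping None entries, without building per-bucket lists."""
    total = 0.0
    count = 0   # non-None values seen in the current bucket
    seen = 0    # all values seen in the current bucket
    for v in values:
        if v is not None:
            total += v
            count += 1
        seen += 1
        if seen == values_per_point:
            yield total / count if count else None
            total, count, seen = 0.0, 0, 0
    if seen:  # trailing partial bucket
        yield total / count if count else None
```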
Thanks for the answer and the advice. I really like Go, but I was a bit skeptical about go-graphite, in particular I was not sure it implements all the graphite functions (there are tons of them, and keeping two projects like this in sync is not easy). I will try it and report the performance results I obtain. In the meantime I tried hacking datalib.py to see where the bottleneck is. I first removed the use of the intermediate buffer buf[] and ended up changing:
def __iter__(self):
    if self.valuesPerPoint > 1:
        return self.__consolidatingGenerator( list.__iter__(self) )
    else:
        return list.__iter__(self)
to this:
def __iter__(self):
    return list.__iter__(self)
completely removing the use of the __consolidatingGenerator function (we all understand it no longer returns meaningful data; it was just a test). The performance results are roughly the same (less than a 2s gain), so it seems the bottleneck is not in these two functions.
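Rather than patching functions one by one, profiling the render path would narrow the bottleneck down directly. A generic cProfile sketch; `expensive_render` is a stand-in, and the real hot spot in graphite-web may well be elsewhere (e.g. Whisper file reads or JSON serialization):

```python
import cProfile
import io
import pstats

def expensive_render():
    # Stand-in for the graphite-web render path; replace with the
    # real call when profiling in place.
    data = [float(i) for i in range(500_000)]
    return sum(data) / len(data)

profiler = cProfile.Profile()
profiler.enable()
expensive_render()
profiler.disable()

# Print the 5 most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```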
Same hardware, same Docker setup, same .wsp file, but using go-carbon instead of graphite-web:
root@nuc:~$ time wget --post-data="target=openhab.consommeHP_Fours&from=-150d&until=now&format=json&maxDataPoints=10&" http://192.168.2.200:81/render
--2021-05-25 16:24:50-- http://192.168.2.200:81/render
Connecting to 192.168.2.200:81… connected.
HTTP request sent, awaiting response… 200 OK
Length: 373 [application/json]
real 0m0,011s
user 0m0,007s
sys 0m0,003s
root@nuc:~$ time wget --post-data="target=openhab.consommeHP_Fours&from=-150d&until=now&format=json&maxDataPoints=100&" http://192.168.2.200:81/render
--2021-05-25 16:26:48-- http://192.168.2.200:81/render
Connecting to 192.168.2.200:81… connected.
HTTP request sent, awaiting response… 200 OK
Length: unspecified [application/json]
real 0m0,102s
user 0m0,001s
sys 0m0,009s
root@nuc:~$ time wget --post-data="target=openhab.consommeHP_Fours&from=-150d&until=now&format=json&maxDataPoints=1000&" http://192.168.2.200:81/render
--2021-05-25 16:26:55-- http://192.168.2.200:81/render
Connecting to 192.168.2.200:81… connected.
HTTP request sent, awaiting response… 200 OK
Length: unspecified [application/json]
real 0m0,101s
user 0m0,004s
sys 0m0,005s
I will see how it behaves in real-life use.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Describe the bug: When querying a graphite metric over 150d, I expected to use the consolidateBy function and the maxDataPoints option to speed up the query. Both approaches failed.
To Reproduce
When decreasing the number of data points, the request takes longer. When trying to improve performance by using consolidateBy with 'last', the timings are equivalent:
Expected behavior: 1) I expected that reducing maxDataPoints would reduce the query time (I can understand this one, since graphite has to compute all the averages, but it is still surprising). 2) I expected the consolidateBy function with 'last' or 'first' to be quicker than the default 'avg'. 3) I expected faster queries on this database.
Additional context: I am using the following retention:
and the wsp file info returns this: