criteo / biggraphite

Simple Scalable Time Series Database
Apache License 2.0

Carbonlink malfunctioning due to start_step #503

biox closed this issue 5 years ago

biox commented 5 years ago

See: https://github.com/criteo/biggraphite/blob/e1ad5c0bb3612f88a0485bfd0a7a88b7d66e891d/biggraphite/plugins/graphite.py#L195

Carbonlink queries were failing on my Graphite installation and returning no results. Cached points were not being merged properly with the results from Cassandra.

Example debug output (stock graphite 1.1.5 + biggraphite 0.14.8 install):

####### CACHED POINTS #######
2019-01-25,16:58:19.618 :: [(1548457093, 561534.353565), (1548456673, 563509.619721), (1548457063, 561246.518679), (1548456643, 473922.785076), (1548456613, 562445.92278), (1548456583, 598263.299423), (1548456553, 559780.203378), (1548457003, 477243.585225), (1548456973, 622545.077276), (1548456943, 601735.654326), (1548456913, 587953.255013), (1548456883, 575138.332545), (1548456853, 473907.994207), (1548457033, 564472.531896), (1548456823, 596697.096736), (1548456793, 553094.528122), (1548456763, 479594.943907), (1548456733, 589657.623537), (1548456703, 617799.501142)]

####### PRE-MERGE POINTS #######
2019-01-25,16:58:19.618 :: [571492.910615, 515965.442193, 538876.651253, 574834.004966, 555577.538094, 573913.095091, 600777.729518, 555301.363156, 475642.744317, 579449.631994, 614931.237084, 562061.834886, 561214.444658, 565007.372402, 567487.236456, 512165.908196, 584045.390813, 581256.516553, 583784.408046, 572360.126764, 578463.664944, 593646.427695, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]

####### MERGED POINTS #######
2019-01-25,16:58:19.619 :: [571492.910615, 515965.442193, 538876.651253, 574834.004966, 555577.538094, 573913.095091, 600777.729518, 555301.363156, 475642.744317, 579449.631994, 614931.237084, 562061.834886, 561214.444658, 565007.372402, 567487.236456, 512165.908196, 584045.390813, 581256.516553, 583784.408046, 572360.126764, 578463.664944, 593646.427695, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]

You can see above that several cached datapoints were ready to be merged with the "actual" datapoints. However, after the merge, the output is unchanged: none of the cached points were merged in. This is due to an incorrect start value being passed to the merge_cached_points function in graphite.py.

After some debugging, I found the following error:

DEBUG: start_time: 51615267
Traceback (most recent call last):
  File "/opt/graphite/webapp/graphite/readers/utils.py", line 106, in merge_with_cache
    values[i] = value
IndexError: list assignment index out of range
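
To illustrate why a step count in place of a timestamp produces this IndexError, here is a minimal sketch of slot-based cache merging. It is a simplified stand-in, not Graphite's actual merge_with_cache. The function name merge_cached, the sample cached points, and the 30-second step are assumptions for illustration (the cached points in the first log are spaced 30 seconds apart, and 51615267 * 30 = 1548458010, which looks like a Unix timestamp from the same period).

```python
def merge_cached(cached_points, start, step, values):
    """Simplified sketch: place each cached point into its time slot.

    `start` must be a Unix timestamp in seconds, like the cached timestamps.
    Hypothetical helper, not Graphite's real implementation.
    """
    merged = list(values)
    for timestamp, value in cached_points:
        # Slot index = elapsed seconds since `start`, divided by the step.
        i = int(timestamp - start) // step
        if i < 0:
            continue  # point is older than the requested window
        merged[i] = value  # IndexError if the slot is past the end
    return merged


step = 30                        # assumed resolution in seconds
start_step = 51615267            # value seen in the debug output
start_time = start_step * step   # 1548458010, a timestamp in seconds

values = [None, None, None, None]
cached = [(1548458040, 1.0), (1548458070, 2.0)]  # made-up sample points

# With a timestamp, the cached points land in slots 1 and 2.
print(merge_cached(cached, start_time, step, values))

# With the step count, the computed index is in the tens of millions.
try:
    merge_cached(cached, start_step, step, values)
except IndexError as exc:
    print("IndexError:", exc)
```

If the real merge code swallows this exception, the values list would come back untouched, which would match the identical PRE-MERGE and MERGED output above.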

After I changed the value being passed from start_step to start_time, everything started working:

2019-01-25,18:32:02.876 :: CACHED POINTS:
2019-01-25,18:32:02.876 :: [(1548462215.921823, 0), (1548462605.894181, 0), (1548462095.914606, 0), (1548462485.886325, 0), (1548461975.906017, 0), (1548462365.921638, 0), (1548462245.879918, 0), (1548462635.883766, 263), (1548462125.932394, 0), (1548462515.885838, 0), (1548462005.930301, 4), (1548462395.91278, 0), (1548462275.919494, 0), (1548462666.03804, 148), (1548462155.878494, 70), (1548462545.902362, 0), (1548462035.917744, 0), (1548462425.87636, 74), (1548462305.922018, 0), (1548462695.922862, 0), (1548462185.92294, 0), (1548462575.89962, 0), (1548462065.886928, 0), (1548462455.908451, 70), (1548462335.909947, 0)]
2019-01-25,18:32:02.876 :: PRE-MERGE POINTS:
2019-01-25,18:32:02.876 :: [None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
2019-01-25,18:32:02.930 :: MERGED POINTS:
2019-01-25,18:32:02.931 :: [0, 70, 0, 0, 0, 0, 0, 0, 0, 0, 74, 70, 0, 0, 0, 0, 0, 263, 148, 0, None]

Now cached points are being merged properly!

This seems like a bug to me, and I'd be happy to submit a PR. Either that, or I don't understand the purpose of start_step.

Best,

biox

adriengentil commented 5 years ago

Hello,

I just had a quick look, and it does indeed seem that _merge_cached_points expects start_time instead of start_step. If you can open a PR, I would be glad to approve it!

Thanks, Adrien