graphite-project / carbon

Carbon is one of the components of Graphite, and is responsible for receiving metrics over the network and writing them down to disk using a storage backend.
http://graphite.readthedocs.org/
Apache License 2.0

Any way to optimize CPU consumption of carbon-cache? #539

Closed pavel-odintsov closed 7 years ago

pavel-odintsov commented 8 years ago

Hello!

I have a Graphite instance that receives about 50k data points every 3 seconds. That should not be much, because my server is pretty fast.

But the carbon-cache process actually consumes a lot of CPU:

11652 _graphi+  20   0  282484 152660   4424 S 31.0  3.8   3:18.96 carbon-cache 

Here is the perf top output:

  22.00%  python2.7                          [.] PyEval_EvalFrameEx
   3.76%  python2.7                          [.] PyObject_GetAttr
   3.07%  python2.7                          [.] PyFloat_FromString
   2.13%  python2.7                          [.] PyFrame_New
   2.09%  python2.7                          [.] PyTuple_New
   1.78%  python2.7                          [.] PyDict_GetItem
   1.76%  python2.7                          [.] 0x00000000000f847d
   1.58%  python2.7                          [.] 0x000000000014eee4
   1.49%  python2.7                          [.] 0x0000000000162acf
   1.37%  python2.7                          [.] PyString_FromFormatV
   1.31%  python2.7                          [.] _Py_dg_strtod
   1.26%  python2.7                          [.] 0x000000000014ef9c

Do you have any idea what PyEval_EvalFrameEx is and how I could avoid it?

I'm using 0.9.12-3 from Ubuntu 14.04.

deniszh commented 8 years ago

That's quite a lot for a single Python process. Python does not scale across CPU cores because of the GIL. Try splitting your load across 2, 4, 6, ... carbon-cache instances using carbon-relay.
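
To illustrate the idea of splitting the load, here is a minimal sketch of sharding metrics across several cache instances by hashing the metric name. It is only an illustration: it uses plain modulo hashing with `zlib.crc32`, which is not how carbon-relay itself routes metrics, and the function names `pick_instance` and `split_load` are hypothetical.

```python
import zlib
from collections import Counter


def pick_instance(metric, instance_count):
    # Hash the metric name so the same metric always maps to the same instance.
    # Plain modulo hashing, chosen for brevity (an assumption of this sketch).
    return zlib.crc32(metric.encode("utf-8")) % instance_count


def split_load(metrics, instance_count):
    # Count how many metrics each instance index would receive.
    return Counter(pick_instance(m, instance_count) for m in metrics)


if __name__ == "__main__":
    # Hypothetical metric names, only for demonstration.
    names = ["servers.web%d.cpu.user" % i for i in range(1000)]
    for instance, count in sorted(split_load(names, 4).items()):
        print("instance %d: %d metrics" % (instance, count))
```

The point of hashing on the name is that a given metric always lands on the same instance, so each process handles only its own share of the metrics.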

pavel-odintsov commented 8 years ago

Thanks for the answer! I will try it. But it would be nice to know which function consumes so much CPU and re-implement it in C instead.
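
For context, PyEval_EvalFrameEx is CPython's bytecode interpreter loop, so perf attributes nearly all pure-Python work to it rather than to a specific Python function. A Python-level profiler can break that time down by function. Below is a minimal sketch using the standard `cProfile` module, written for Python 3. The `parse_line` and `handle_batch` functions are hypothetical stand-ins for a receiver that parses plaintext lines of the form `<metric> <value> <timestamp>`; they are not carbon's actual code.

```python
import cProfile
import io
import pstats
import time


def parse_line(line):
    # Hypothetical parser for the plaintext format "<metric> <value> <timestamp>".
    metric, value, timestamp = line.split()
    return metric, (float(timestamp), float(value))


def handle_batch(lines):
    # Parse every line in a batch; stands in for a receiver's hot loop.
    return [parse_line(line) for line in lines]


def profile_batch(lines):
    # Run handle_batch under cProfile and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(handle_batch, lines)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    now = int(time.time())
    # 50k synthetic data points, mirroring the batch size mentioned above.
    sample = ["servers.web%d.cpu.user %d.5 %d" % (i % 100, i % 7, now)
              for i in range(50000)]
    parsed, report = profile_batch(sample)
    print(report)
```

The report lists time per Python function, which is a more useful starting point for deciding what to rewrite in C than the interpreter-level symbols in the perf output.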

genisd commented 8 years ago

There are carbon(-cache|-relay)? implementations in Go; perhaps that's something interesting for you. I've never used them myself, so I can't say a lot about them ;-)

deniszh commented 8 years ago

Yes, you can try go-carbon too - https://github.com/lomik/go-carbon

eqinox76 commented 8 years ago

Hi, we had some success with using PyPy as the interpreter instead of the default Python interpreter. Our load was handled by 5 carbon-cache instances using Python. With PyPy, two instances were able to handle the same load.

howdoicomputer commented 7 years ago

I'd second the PyPy comment - it worked great for revitalizing an older cluster.