graphite-project / carbon

Carbon is one of the components of Graphite, and is responsible for receiving metrics over the network and writing them down to disk using a storage backend.

http://graphite.readthedocs.org/

Apache License 2.0

1.5k stars 490 forks source link

Add locking in cache.py #825

Closed piotr1212 closed 5 years ago

piotr1212 commented 5 years ago

fixes: #822 #815

piotr1212 commented 5 years ago

I've run the benchmark but getting inconsistent results on every run, so can't say for sure if this doesn't impact performance.

codecov-io commented 5 years ago

Codecov Report

Merging #825 into master will increase coverage by 0.62%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #825      +/-   ##
==========================================
+ Coverage   50.13%   50.75%   +0.62%     
==========================================
  Files          37       37              
  Lines        3441     3584     +143     
  Branches      495      554      +59     
==========================================
+ Hits         1725     1819      +94     
- Misses       1604     1644      +40     
- Partials      112      121       +9

Impacted Files	Coverage Δ
lib/carbon/cache.py	`80.92% <100%> (+0.25%)`	:arrow_up:
lib/carbon/protocols.py	`75.3% <0%> (+1.73%)`	:arrow_up:
lib/carbon/conf.py	`46.33% <0%> (+5.6%)`	:arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e8725e4...a96726a. Read the comment docs.

DanCech commented 5 years ago

It looks like what is needed to solve the problem, I'll try running the benchmarks and see what I can find

deniszh commented 5 years ago

I'll try to create multicarbon configuration, maybe it will be better for benchmarking - comparing single carbon instance is hard.

piotr1212 commented 5 years ago

@deniszh I was referring to https://github.com/graphite-project/carbon/blob/master/lib/carbon/tests/benchmark_cache.py I've run it a bunch of times with and without locking on Python 2.7 and 3.7 but did not get consistent results.

tongyingrui commented 5 years ago

@piotr1212 I manually picked up this commit, but stilil get the same error, did I miss something?

07/12/2018 15:17:56 :: [console] Unhandled Error
Traceback (most recent call last):
  File "/opt/graphite/lib/python3.6/site-packages/twisted/python/threadpool.py", line 250, in inContext
    result = inContext.theWork()
  File "/opt/graphite/lib/python3.6/site-packages/twisted/python/threadpool.py", line 266, in <lambda>
    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/opt/graphite/lib/python3.6/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/opt/graphite/lib/python3.6/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/opt/graphite/lib/carbon/writer.py", line 189, in writeForever
    writeCachedDataPoints()
  File "/opt/graphite/lib/carbon/writer.py", line 98, in writeCachedDataPoints
    (metric, datapoints) = cache.drain_metric()
  File "/opt/graphite/lib/carbon/cache.py", line 190, in drain_metric
    metric = self.strategy.choose_item()
  File "/opt/graphite/lib/carbon/cache.py", line 116, in choose_item
    return next(self.queue)
  File "/opt/graphite/lib/carbon/cache.py", line 104, in _generate_queue
    metric_counts = sorted(self.cache.counts, key=lambda x: x[1])
  File "/opt/graphite/lib/carbon/cache.py", line 163, in counts
    in self.items()]
  File "/opt/graphite/lib/carbon/cache.py", line 162, in <listcomp>
    return [(metric, len(datapoints)) for (metric, datapoints)
builtins.RuntimeError: dictionary changed size during iteration

piotr1212 commented 5 years ago

Sorry, seems to be still broken. I managed to reproduce it now.

piotr1212 commented 5 years ago

@tongyingrui Could you please test https://github.com/graphite-project/carbon/pull/829

tongyingrui commented 5 years ago

@tongyingrui Could you please test #829

Yes, sure.

And I think I can't reproduce it anymore. It has been running for 12 hours.

piotr1212 commented 5 years ago

@tongyingrui Thank you for testing. Did you notice any side-effect such as higher cpu usage, stalling?

tongyingrui commented 5 years ago

@tongyingrui Thank you for testing. Did you notice any side-effect such as higher cpu usage, stalling?

Sorry, I didn't pay much attention on the CPU usage before.

But at least I didn't see an extremely high cpu usage with the new commit.