Closed crisidev closed 7 years ago
Hum, interesting.. I added prometheus client in travis file. Maybe I don't understand how the build and coverage check is performed. Am I missing something? :smile:
Go tip (https://travis-ci.org/kobolog/gorb/jobs/130436526) fails during testing:
--- FAIL: TestMustMarshalPanic (0.00s)
Error Trace: util_test.go:69
Error: func (assert.PanicTestFunc)(0x472300) should panic
Panic value: <nil>
Go 1.5 compiles fine.
@crisidev there is no panice in tip
, becasue tip
version supports json.Encoding of maps with non-string keys. So this test is not valid for tip
.
https://github.com/golang/go/commit/f05c3aa24d815cd3869153750c9875e35fc48a6e
At first glance you modify two global defined maps from different goroutines. Looks like data race
Interesting, I didn't figure out. Ok, I'll check and fix this.
Hi Matteo!
This is an amazing PR, thank you so much! Overall, I have a couple minor and one major comment: you seem not take any locks when you work with 1) notification channel maps in your code 2) accessing backends and such in the Context
. It's a race since multiple clients can come at the same time via HTTP handlers.
Thanks for bringing gorb to life 😉
I am not taking any lock as I did not figure out I was going into race accessing Context
.
I will add a locking mechanism in my goroutine calls and when I access Context
.
Do you have any preferred way to do it?
Hi Matteo,
Yeah, the race is introduced due to the fact that Go's built-in HTTP server essentially spawns a goroutine on every new HTTP request and these goroutines might execute concurrently, accessing the same map for reads and writes at the same time.
I don't have any preference for locking: the most straightforward way is to hold an RLock for reads and a normal Lock for writes and that's it.
One sidenote: in the high-performance code it's usually better to call Lock() and Unlock() manually instead of defer mutex.Unlock()
since defer
introduces a runtime penalty, but here it's not the case since GORB doesn't have any high-performance code paths.
Hey Andrey,
thanks for the help. I have done the change to my control channels and added locking when I read from shared map like ctx.backends
and ctx.services
.
I am not taking locks when I call GetService()
or GetBackend
since the lock is taken inside the called function. Do you think it is enough?
Regarding the error with travis build on go tip, I honestly don't know how to fix that test on go 1.5 and tip...
@@ master #27 diff @@
========================================
Files 13 13
Lines 218 194 -24
Methods 0 0
Branches 34 0 -34
========================================
- Hits 214 191 -23
+ Misses 3 2 -1
Partials 1 1
Powered by Codecov. Last updated by 2d8df45...3c3a284
I honestly don't know how to fix that test on go 1.5 and tip...
I think this test could be dropped because it's not valid anymore. Or you can use conditional build for the test function.
Hey, so how's it going with testing? I'm excited about merging this PR!
Very busy right now and there is a problem on locking on deletion. I will have time to work on that and fix it over the weekend. I'll be back ASAP! :) On 25 May 2016 8:32 p.m., "Andrey Sibiryov" notifications@github.com wrote:
Hey, so how's it going with testing? I'm excited about merging this PR!
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/kobolog/gorb/pull/27#issuecomment-221681936
Hey @crisidev, do you think you'll be able to get this PR going?
Hey @crisidev, do you think you'll get some time to get this PR ready to roll?
@kobolog I'll try to get that done during my Xmas holidays.. I don't have the test infrastructure anymore, hence I need to rebuild that first. I am sorry for keeping this so long..
Folk, I cannot reproduce the settings I had to fully test and fix this. I am sorry. I think it is better to close this PR.
I'll try to make another one sooner or later. Otherwise you can start from my idea here. Sorry about this.
This is how I instructed Gorb to produce metrics to be consumed by my Prometheus server.
Every time a service or a backend is created, a new time-serie is exposed. The related goroutine updates health metrics from the Pulse. Once the backed or the service is removed, the related time-serie is deleted to allow Prometheus knows about the change.
Gorb exposes Prometheus consumable metrics on http://:4672/metrics
Available timeseries
Example
Dashboard
This allows you to create a dashboard like this: