chillum / jsonmon

Quick and simple monitoring and alerting system
Apache License 2.0
27 stars 4 forks source link

Some improvements #17

Closed malud closed 7 years ago

malud commented 7 years ago

We are using jsonmon in our production environment. For that case we needed some improvements und small fixes. This PR is intended to be transparent as possible on all changes due to their commits.

So the key changes are:

malud commented 7 years ago

would fix #15 and #16

chillum commented 7 years ago

hey. thanks for sharing, I will take some time to review this.

at this moment I took a quick glance and the only thing I dislike at this moment is that godeps are moved out of the repo. why I don't like it:

  1. I think that Godep authors suggest to keep dependencies in target repo and it's considered the best practice
  2. This easens up third-party builds (like https://packager.io or Heroku)
malud commented 7 years ago

I have replaced Godeps with govendor since I had some issues with Godeps (Beta) lately and govendor is more mature. I really dislike the behaviour of Godeps changes my packages in $GOPATH on godeps restore which could result in detached heads and go get wouldn't work anymore. So this could be somewhat annoying while working on multiple projects. Depends on the way everyone is working with $GOPATH :)

chillum commented 7 years ago

frankly, I didn't use govendor, but now will take a look. is not keeping the vendored source files (not only their repo and hash) a best practice for govendor (my previous comment applies not to govendor vs godep, but to removing vendored deps from the repo)?

chillum commented 7 years ago

BTW, what is the reason to use godep restore at all? that is: you check in the dependencies in vendor and then only run update when you have a new version of dependencies in $GOPATH (and it updates the vendored copy from the GOPATH one, but not vice versa). I thought, that's the way to use godep + vendor.

chillum commented 7 years ago

I also suspect that your Makefile may work with GNU Make only (but currently don't have the BSD make at hand to check it). If that's really the case, we may want to rename Makefile to GNUmakefile

chillum commented 7 years ago

please explain me, why we don't need mutex locks. that is: we still have a global checks struct. concurrent reading (web ui) and writing to it (check) is unsafe (although, it's not necessarily will hit you in a short run). golang-nuts thread for reference: https://groups.google.com/forum/#!topic/golang-nuts/gWaAoFQKjaA

chillum commented 7 years ago

and a comment about the build system. I agree that may be environments that don't want the Ruby dependency. but on the other hand the new makefile-based system is not 100% equivalent. namely two things that personally I like in the old one:

  1. it runs go install to a standard location, so I'm able to run the latest binary from my $PATH on my workstation
  2. it puts the .zip files to ~/Downloads (personal preference again)

I don't think it's a very good idea to hardcode this behaviour to the new build system, so probably we want to support them both (that is: as for now I'm against removing the Ruby-based build scripts, but we may add the makefile as a second option).

malud commented 7 years ago

Maybe its my personal preference to have a dependency definition and pulling them if needed. So govendor doesn't work against the /vendor content. So I can re-add those files and we are good to go if you want.

The Makefile works indeed with GNU make only. I can rename this file.

The mutex topic: We have ignored this possible race condition due to the fact that there are no high frequency updates. I will add a test and the mutex lock.

chillum commented 7 years ago

frankly, about the dependencies:

  1. yes, I would like to keep the dependencies bundled
  2. I would like to keep godep, not govendor. it seems like godep is the standard, so I would like to keep using it unless we really have some problems with it.

if the godep restore is your only grief, I would like to keep godep (again: from my point of view you don't need this command at all in your workflow, but ymmv).

to illustrate my point: I eventually want to provide Linux Yum and Apt repos for installing and updating jsonmon and it seems like all the build solutions favor godep.

chillum commented 7 years ago

so, to sum this all:

  1. please revert the old Ruby build system. it's OK if we have both make and rake buildsystems.
  2. please revert godeps
  3. please revert mutexes. It took a long time for me to actually set up them right, although initially I was on 'it seems that nothing crashes without them' page as well. so the latest thing I want to do is just to throw them out. we still need them.

just saw your commit on mutexes. let me explain why I've done it like I did (one global lock for checks, not separate). we have a global []check array and we need to lock it all, not by parts. the previously mentioned golang-nuts thread explains, why this is essential.

it seems like I'm OK with other changes, but would like to review it all.

malud commented 7 years ago

The current difference between a global mutex and one per check is: it is very clear what object the mutex protects from the readability point of view. Technically it's the same as locking/unlocking them all per loop. Important to mention here is the Read Lock/Unlock since web requests would block each other and the checks themself. Feel free to play with the given test and enabled race detector. The checks are save now.

Further reading: https://hackernoon.com/dancing-with-go-s-mutexes-92407ae927bf

btw thanks for this discussion :)

chillum commented 7 years ago

thanks, will investigate the locks once more. it seems that /vendor is still blacklisted in .gitignore, that's trivial, but let's not forget this

chillum commented 7 years ago

I launched this version and have more comments. I assume by design that if we're logging to stdout, it's probably Linux. So I provide my stdout logs with systemd priority codes. <3> is error, <7> is debug, no tag is info and eq <6>.

I would strongly like that:

  1. We kept the message levels that used to be. For some reason the failed check is info now and used to be an error, for example.
  2. We added the new debug logs as <7>..., so it's possible to skip them when we don't want to see them or ship to logserver.
chillum commented 7 years ago

on my tests your version uses slightly more memory. It can be described with the fact that it uses a different data structure: as one particular change, it always sets name (to shell or web).

please elaborate on why do we need this.

malud commented 7 years ago

Name gets set only once if empty and is used for logging to have an idea which check fails. Below this the endless loop is running with given interval.

chillum commented 7 years ago

launched the latest code:

  1. now we have double or triple newlines between log entries
  2. not sure if or are journald-compatible priorities, need to check
chillum commented 7 years ago
  1. still too much infos, e.g. Running web check, retry # 1 is definitely debug
malud commented 7 years ago

1 & 3 are fixed

  1. The loglevels are taken from syslog package and since the logger calling the underlying methods they should definitively match.
chillum commented 7 years ago

in this version I'm really missing the 'check is back normal' log message (it was notice)