@vmarkovtsev there appears to be an issue with `labours`, even though nothing outward-facing has changed with this PR - do you have any ideas as to why this might be happening?
Yes, the problem is with Google's `ortools`, which is a dependency of `seriate`, which is in turn a dependency of `labours`. They released v7 on March 26th, which is not backward compatible, and I did not specify an upper version constraint in `seriate`. Fixing this now.
Thanks @vmarkovtsev, looks like the build is passing now!
Reviewing this tomorrow :+1:
@vmarkovtsev real bugs should still propagate errors upwards for the caller to deal with. For example, a user embedding Hercules as a library in a web service would definitely not want a fatal crash every time; they would rather log and handle the error safely, or at least get the chance to gracefully stop all connections before crashing. If it's a service with state and buffers that need to be written to disk, a panic has a real danger of causing that data to be lost.
If done right, returning errors accomplishes the same thing as panicking, since the error propagates up to the caller (for example, the Hercules command), and a panic can happen there - or the program can simply exit with 1. The logger can be configured to capture stacktraces on errors as well.
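A minimal sketch of what I mean - the names here are made up for illustration, not the actual Hercules API:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// consume stands in for a library function: instead of panicking on an
// "impossible" state, it reports the problem as an error and lets the
// caller decide what to do about it.
func consume(deps map[string]interface{}) error {
	blob, exists := deps["blob"]
	if !exists {
		return errors.New("core: required dependency \"blob\" is missing")
	}
	fmt.Println(blob)
	return nil
}

func main() {
	// A CLI caller can still die loudly, while a web service embedding the
	// same library could log the error, flush its buffers, and keep serving.
	if err := consume(map[string]interface{}{}); err != nil {
		log.Fatal(err) // exits with a non-zero status, same end result as a panic
	}
}
```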
This is the thing: the places which I marked (I did not mark everything) are assertions for conditions which should never happen - the same as a C++ assert or accessing a slice element out of bounds. If Go had macros, these checks would be compiled out entirely in a release build.
Hm... even then, I believe it's still better to propagate the error properly, since sometimes unrelated things can cause these "impossibilities" to happen, even if it doesn't seem like it (at least that seems to be what happened here: https://github.com/src-d/hercules/pull/234#issuecomment-471200837).
It also feels better for the sake of consistency - out of bounds is one thing, but an unexpected invariant violation feels like something that should be returned as an error, at least based on what I understand of Go best practices, for the cases I mentioned earlier: in the event something goes wrong, even if it's probably impossible, you want to give the caller the chance to clean up properly after itself, which could be critical (a panic interrupting a database write is less than pleasant).
If you feel strongly about keeping the panics I suppose I can change them back, but I would personally strongly recommend moving towards reducing the panics in the library component of Hercules (would be very helpful for me 😅)
Regarding catching panics in a lib, there is always `recover()` at your service.
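For illustration, a minimal sketch of that approach - a hypothetical wrapper, not something that exists in Hercules:

```go
package safe

import "fmt"

// Run shields a caller from panics raised inside a library call by turning
// them back into ordinary errors via recover().
//
// Usage (hypothetical): if err := safe.Run(func() { doLibraryWork() }); err != nil { ... }
func Run(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	fn()
	return nil
}
```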
I know your pain - I used to suffer the same way when I used enry to detect languages; it used to be very buggy and panicked all the time when used concurrently. On the other hand, I could report proper panic stack traces to the maintainers, and they were able to fix problems very fast. I do understand that we can change those logging calls to include stack traces. If you know how to do it (print a fancy stack trace when a 🐛 appears), to me it is the same as panicking, so I am fine with that actually.
Thanks! I actually did not consider `recover()` at all, so thank you for pointing that out. I'll look into both solutions (log stacktrace + return error vs panic), see which might be best for these cases, and make the necessary changes. I'll apply fixes for your other comments as well.
Thanks for the review!
@vmarkovtsev I've added a stacktrace capture to all calls to `Error` and `Errorf`. The output looks like this:
[ERROR] 2019/04/01 17:12:39 hello-world
[ERROR] 2019/04/01 17:12:39 stacktrace:
gopkg.in/src-d/hercules.v10/internal/core.TestLogger(0xc000178300)
/Users/robertlin/go/src/gopkg.in/src-d/hercules.v10/internal/core/logger_test.go:50 +0xbf4
testing.tRunner(0xc000178300, 0x170a140)
/usr/local/Cellar/go/1.12/libexec/src/testing/testing.go:865 +0xc0
created by testing.(*T).Run
/usr/local/Cellar/go/1.12/libexec/src/testing/testing.go:916 +0x357
does that look okay to you?
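(For reference, a minimal sketch of how such a capture can be done with the standard library's runtime/debug - the type and constructor names here are illustrative, not necessarily the exact implementation in the PR.)

```go
package core

import (
	"log"
	"os"
	"runtime/debug"
)

// DefaultLogger is an illustrative logger that appends the current
// goroutine's stack trace to every error-level message, producing output
// like the snippet above.
type DefaultLogger struct {
	errLogger *log.Logger
}

// NewDefaultLogger returns a logger writing to stderr with an [ERROR] prefix.
func NewDefaultLogger() *DefaultLogger {
	return &DefaultLogger{errLogger: log.New(os.Stderr, "[ERROR] ", log.LstdFlags)}
}

// Error logs the message followed by a formatted stack trace.
func (l *DefaultLogger) Error(v ...interface{}) {
	l.errLogger.Println(v...)
	l.errLogger.Printf("stacktrace:\n%s", debug.Stack())
}

// Errorf logs the formatted message followed by a stack trace.
func (l *DefaultLogger) Errorf(format string, v ...interface{}) {
	l.errLogger.Printf(format, v...)
	l.errLogger.Printf("stacktrace:\n%s", debug.Stack())
}
```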
Yep, but let's make a separate level for such logs, e.g. "critical" or "fatal".
Hm, is there a particular reason why? My thoughts:
- `Critical()` breaks drop-in compatibility with many popular loggers (notably zap and logrus)
- `Fatal()` means we either follow the standard behaviour of calling `os.Exit`, which means we're back to panic-like behaviour, or we don't call `os.Exit` and introduce a behavioural inconsistency with popular loggers (zap, logrus, and the standard library all exit on `Fatal`)

Another thought: zap, by default, captures stacktraces on Error-level logs and above, so capturing stacktraces in `Error()` isn't out of the ordinary:
https://github.com/uber-go/zap/blob/master/config.go#L67
// DisableStacktrace completely disables automatic stacktrace capturing. By
// default, stacktraces are captured for WarnLevel and above logs in
// development and ErrorLevel and above in production.
DisableStacktrace bool `json:"disableStacktrace" yaml:"disableStacktrace"`
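For example, zap's stock production logger already behaves this way (just to illustrate the default, not a proposal to depend on it):

```go
package main

import "go.uber.org/zap"

func main() {
	// zap.NewProduction uses the config above: ErrorLevel and higher entries
	// carry a "stacktrace" field automatically.
	logger, err := zap.NewProduction()
	if err != nil {
		panic(err)
	}
	defer logger.Sync()

	logger.Error("hello-world") // logged together with the call site's stack trace
}
```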
Besides, a stacktrace should be interesting for all instances of `l.Error`, right? Would be interested in hearing your thoughts on this.
Most of the errors do not require printing the trace, and otherwise the log will be flooded with dumps.
I've added `Critical` etc. in https://github.com/src-d/hercules/pull/262/commits/c92aaacc8b5498a6d9f9b5699a2aed9ea904f85b
closes #244
This PR allows something like the following (edit: no longer possible as of https://github.com/src-d/hercules/pull/262/commits/c92aaacc8b5498a6d9f9b5699a2aed9ea904f85b - a wrapper needs to be written around zap):
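(A rough sketch of what such a zap wrapper could look like - the `Logger` interface and method set below are illustrative, not the actual definitions in `internal/core`.)

```go
package main

import "go.uber.org/zap"

// Logger mirrors the kind of small interface the PR introduces so that
// pipeline items log through an injected implementation instead of the
// global stdlib logger; the real method set in internal/core may differ.
type Logger interface {
	Info(args ...interface{})
	Warn(args ...interface{})
	Error(args ...interface{})
	Errorf(format string, args ...interface{})
}

// zapAdapter is the kind of "wrapper around zap" mentioned above.
type zapAdapter struct{ s *zap.SugaredLogger }

func (a zapAdapter) Info(args ...interface{})  { a.s.Info(args...) }
func (a zapAdapter) Warn(args ...interface{})  { a.s.Warn(args...) }
func (a zapAdapter) Error(args ...interface{}) { a.s.Error(args...) }
func (a zapAdapter) Errorf(format string, args ...interface{}) {
	a.s.Errorf(format, args...)
}

func main() {
	z, err := zap.NewDevelopment()
	if err != nil {
		panic(err)
	}
	defer z.Sync()

	var l Logger = zapAdapter{s: z.Sugar()}
	l.Errorf("something %s happened", "unexpected")
}
```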
I've added a `Logger` and a `Config` variable for facts in package `internal/core`, and also added the logger to all pipeline items and replaced the log calls that I could find.

There are still a few calls to `panic` that I think we should try to remove in favour of returning an `error`, but I noticed some tests actually assert that panics occur, so I've tried not to change too much for now (notably `BurndownAnalysis`).