Closed ghaskins closed 2 months ago
@ghaskins Hi Gregory, thanks for pinging about this. Apologies for the trouble!
Busy investigating, in the meantime - any idea how frequently that error's showing up? Is it a rare occurrence?
@ptaoussanis is 100% reproducible within a unit test, but it just started happening, and I am not sure why. I'll see if I can narrow it down or get a simple reproduction case for you.
To clarify- when you say "100% reproducible", you mean the warning is printed for every log call?
Sorry, no: I mean I started seeing at least a handful of these messages for 100% of running the unit-test of one of my apps. I am not sure if it's every log or not.
One thing I notice that is curious is it suggests maybe it is hanging getting the hostname, and then it eventually times out and hits a secondary issue trying to print the error. I may be misinterpretting the stack, though.
One other oddity is that it happens both locally on my workstation and CI, ruling out an env-specific issue.
Preliminary investigation doesn't show any obvious reason why you'd be seeing that error. From your stack, it looks like something is interrupting your handler thread while it's waiting on an updated hostname.
In principle that should only occur during shutdown, but I might be missing something obvious.
I've just pushed Encore v3.97.1 which fixes a minor type hinting bug in the region, though that's a long shot and I wouldn't expect it to cause the error you're seeing.
About to sign off for the evening, but will continue investigating first thing tomorrow.
In principle that should only occur during shutdown, but I might be missing something obvious.
That is an interesting insight. There is a reasonable chance it only happens during shutdown (in this case, the fixture is destroying the UT environment). I'll pull your latest and try it out.
encore 3.97.1 did NOT resolve the issue. Seems to behave the same.
One more datapoint. I bisected and found that the problem was not happening with timbre 6.3.1, and then I made a directed test, as so:
diff --git a/project.clj b/project.clj
index de239e7..9a5c37c 100644
--- a/project.clj
+++ b/project.clj
@@ -48,7 +48,8 @@
[io.undertow/undertow-core "2.3.10.Final"]
[io.undertow/undertow-servlet "2.3.10.Final"]
[com.google.protobuf/protobuf-java "3.25.2"]
- [com.taoensso/timbre "6.3.1"]
+ [com.taoensso/encore "3.97.0"]
+ [com.taoensso/timbre "6.5.0"]
and that alone reproduced the problem (FWIW). I will confirm what version of encore was getting pulled into the "good" case and report back.
In the case that works with [com.taoensso/timbre "6.3.1"], it is pulling in [com.taoensso/encore "3.74.0"] via transitive deps.
The hostname error in your stacktrace will be related to this which is when Timbre (v6.5.0) switched to a more sophisticated hostname util provided by Encore.
The question is what's causing the error.
From your stacktrace, it looks like the thread awaiting hostname info is being interrupted.
I can't think of any explanation for that, besides an interrupt due to hard system shutdown.
The things that'd be most helpful atm would be to know-
Otherwise will take a look again tomorrow with fresh eyes.
Thanks for your assistance with this, much appreciated.
Okay, took a fresh look.
Observations from looking at your stack trace:
error-fn
is failing on some data+err (cause not logged [1])output-fn
had failed on some data (cause is logged) [2]@hostname_
failed due to awaiting (handler) thread being interruptedPossible explanation:
@hostname_
to be interrupted, triggering CountDownLatch InterruptedException [2].error-fn
to fail here [1].So my current best guess is that your unit tests are triggering some sort system/pool shutdown that is interrupting a logging call blocked on getting a hostname. (Which would btw imply that the hostname cache is cold, which might imply an issue with your system's hostname).
So far that would be expected behaviour. The piece of the story missing is why your error-fn
would be failing.
Unfortunately that cause [1] isn't currently captured.
It COULD be that the error-fn
is also failing because of some issue with IT fetching a hostname - though that shouldn't happen, and I can't think of any obvious reason why it should (the hostname fn should never throw). Another possible explanation would be that the library Timbre uses for formatting exceptions (and that was also updated in Timbre v6.5.0) is somehow choking.
I'll cut a debug build for you which logs [1]. Unfortunately I'd need to ask you to try that on your side, since I'm otherwise out of ideas.
I've just pushed [com.taoensso.debug/timbre "6.5.0-issue-389a"]
to Clojars (please note the .debug) - that'll add info about error [1] to your log output, which should help us figure out what's going on.
Please try that when convenient, and share your new logging output (esp. from "[TIMBRE WARNING]") π
I will do that; thank you for your diligence on this!
@ptaoussanis I work with @ghaskins and encountered the same issue. I've attached two logs - with the original 6.5.0
and with the 6.5.0-issue-389a
.
[timbre-exception.6.5.0.log](https://github.com/taoensso/timbre/files/14923380/timbre-exception.6.5.0.log)
[timbre-exception.6.5.0-issue-389a.log](https://github.com/taoensso/timbre/files/14923381/timbre-exception.6.5.0-issue-389a.log)
Thanks @muralisrini!
Could you please try again and share your logs with both the following dependencies specified in this exact order:
[com.taoensso.debug/encore "3.98.0-timbre-issue-389a"] ; 1st
[com.taoensso.debug/timbre "6.5.0-issue-389a"] ; 2nd
(Note the .debug
in the group name).
My current understanding of what's going on in your case:
The debug Encore build above is intended to address point 4 - by catching interrupts internally.
If I'm right about what's going on, then this should hopefully resolve the problem and eliminate the "[TIMBRE WARNING]" log output.
Please let me know if it does, and I'll cut a stable Encore release π
Update: Encore v3.99.0
is now up and includes the above change since it'll anyway be nice to have.
Unfortunately, @ptaoussanis I think the issue is still there ? Attaching log..
timbre-exception.3.98.0-timbre-issue-389a-6.5.0-issue-389a.log
I made sure I was using the versions you indicated:
;[com.taoensso/encore "3.90.0"]
;[com.taoensso/timbre "6.5.0"]
[com.taoensso.debug/encore "3.98.0-timbre-issue-389a"] ; 1st
[com.taoensso.debug/timbre "6.5.0-issue-389a"] ; 2nd
It looks like you must somehow still be pulling in an old version of Encore (e.g. v3.90.0)-
[taoensso.encore$eval8838$get_hostname__8839 invokePrim "encore.cljc" 5670]
This line number (5670) doesn't correspond to the latest versions ("3.98.0-timbre-issue-389a" or "3.99.0").
Is it possible you have a dependency conflict? What are you using for your dependencies?
If Leiningen, you can try to run lein deps :tree
to see what version of Encore's actually being used.
If it really shows "3.98.0-timbre-issue-389a" or "3.99.0", then it might be some sort of caching issue with your build/test environment.
I am using Leiningen. Did some digging based on the clue that line numbers don't match. Turned out we are using another lib
com.fzakaria/slf4j-timbre "0.4.1"]
which appears to be the cause of this issue. The dependency difference with and without that lib:
diff /tmp/deps.zack /tmp/deps.nozack
329,330d328
< [com.fzakaria/slf4j-timbre "0.4.1"] -> [com.taoensso/timbre "6.3.1"] -> [io.aviso/pretty "1.4.4"]
< and
336d333
< [com.fzakaria/slf4j-timbre "0.4.1" :exclusions [io.aviso/pretty]]
570d566
< [com.fzakaria/slf4j-timbre "0.4.1"]
It also turns out that I don't see the exception traces when I move to earlier version of that lib [com.fzakaria/slf4j-timbre "0.3.21"]
. The downside is it appears to pull in older versions of your libs:
lein deps
Retrieving com/taoensso/timbre/4.10.0/timbre-4.10.0.pom from clojars
Retrieving com/taoensso/encore/2.91.0/encore-2.91.0.pom from clojars
In any case appears the issue should likely be filed against that library (@ghaskins wdyt ?).
There maybe benefit to having the fixes you put into the debug jars ? Leave that entirely to you.
(Thank you for your prompt responses and help uncovering root cause!)
I just realized that because of the name change to add 'debug", it is now not fully excluding other older versions of timbre...Once I fixed that, it now all works (the problem is resolved)
In short, making this change
diff --git a/project.clj b/project.clj
index 1b06f2e..21f3480 100644
--- a/project.clj
+++ b/project.clj
@@ -13,6 +13,8 @@
:password [:gpg :env/gitlab_token_value]}]]
:dependencies [[org.clojure/clojure "1.11.2"]
[org.clojure/core.async "1.6.681"]
+ [com.taoensso.debug/encore "3.98.0-timbre-issue-389a"] ; 1st
+ [com.taoensso.debug/timbre "6.5.0-issue-389a"] ; 2nd
[org.javassist/javassist "3.29.2-GA"]
[cheshire "5.13.0"]
[com.google.guava/guava "32.0.1-jre"]
@@ -37,20 +39,18 @@
[io.github.protojure/grpc-client "2.8.2"]
[io.github.protojure/grpc-server "2.8.2"]
[io.github.protojure/google.protobuf "2.0.1"]
- [io.github.manetu/temporal-sdk "0.20.1"]
+ [io.github.manetu/temporal-sdk "0.20.1" :exclusions [[com.taoensso/timbre]]]
[manetu/temporal-utils "1.2.0-2"]
[manetu.platform/protos "1.9.0-28"]
[manetu/utils "1.0.0-18"]
- [manetu/policyengine "2.1.0-16"]
+ [manetu/policyengine "2.1.0-16" :exclusions [[com.taoensso/timbre]]]
[manetu/security "2.0.0-19"]
- [manetu/functional-opa "1.0.0-5"]
- [manetu/dbutils "1.0.0-3"]
+ [manetu/functional-opa "1.0.0-5" :exclusions [[com.taoensso/timbre]]]
+ [manetu/dbutils "1.0.0-3" :exclusions [[com.taoensso/timbre]]]
[io.undertow/undertow-core "2.3.12.Final"]
[io.undertow/undertow-servlet "2.3.12.Final"]
- [com.google.protobuf/protobuf-java "3.25.3"]
- [com.taoensso/encore "3.97.0"]
- [com.taoensso/timbre "6.5.0"]
- [com.fzakaria/slf4j-timbre "0.4.1"]
+ [com.google.protobuf/protobuf-java "3.25.3"]
+ [com.fzakaria/slf4j-timbre "0.4.1" :exclusions [[com.taoensso/timbre]]]
[org.slf4j/jul-to-slf4j "2.0.12"]
[org.slf4j/jcl-over-slf4j "2.0.12"]
[org.slf4j/log4j-over-slf4j "2.0.12"]
and I now no longer see the issue
I have also confirmed that simply doing
diff --git a/project.clj b/project.clj
index 1b06f2e..ae317ae 100644
--- a/project.clj
+++ b/project.clj
@@ -48,7 +48,7 @@
[io.undertow/undertow-core "2.3.12.Final"]
[io.undertow/undertow-servlet "2.3.12.Final"]
[com.google.protobuf/protobuf-java "3.25.3"]
- [com.taoensso/encore "3.97.0"]
+ [com.taoensso/encore "3.99.0"]
[com.taoensso/timbre "6.5.0"]
[com.fzakaria/slf4j-timbre "0.4.1"]
[org.slf4j/jul-to-slf4j "2.0.12"]
Also fixes the issue. (Thank you!)
@muralisrini Thanks for the extra info π
I'm pretty confident that if you have Encore "3.98.0-timbre-issue-389a" or "3.99.0" running, the "[TIMBRE WARNING]" issue on shutdown should be gone.
I don't imagine slf4j-timbre should have any other impact besides possibly interfering with getting the correct dependencies, though I can't say for sure since I'm unfamiliar with its internals.
I would note that the way it depends on Timbre core is unfortunately a little fragile, and there may be at least one known issue with the current version in some cases. As mentioned on that thread - I am planning to integrate SLF4J support directly into Timbre core for Timbre's next major release.
In the meantime, I'd suggest one of the following sets of ordered dependencies:
;; Option 1
[...
[com.taoensso/encore "3.99.0"] ; Latest
[com.taoensso/timbre "6.5.0"] ; Latest
[com.fzakaria/slf4j-timbre "0.4.1"] ; Latest
...]
;; Option 2
[...
[com.taoensso/encore "3.99.0"] ; Latest
[com.taoensso/timbre "6.3.1"] ; Latest known to work with slf4j-timbre
[com.fzakaria/slf4j-timbre "0.4.1"] ; Latest
...]
So long as you ensure that it's actually Encore v3.99.0 running (check with lein deps :tree
), I believe either of these should resolve the current issue.
Option 1 is preferable if possible, but I'm not sure if slf4j-timbre will be happy with Timbre 6.5.0. In case it's not, option 2 should work - at least until SLF4J support is moved into Timbre core.
Ah, race condition on the last few replies :-)
@ghaskins I have also confirmed that simply doing
Great, that's ideal π
Again, my apologies for all the time + trouble getting to the bottom of this.
@muralisrini Thanks for the extra info π
@ptaoussanis more like extra mis-info :-) ... Confirm "3.99.0" fixes for me as well.
Thank you :pray:
Again, my apologies for all the time + trouble getting to the bottom of this.
Please, no apologies are necessary. Your libraries and your support of them are top-notch. Thank you!
Confirmed fixed in encore 3.99.0
@muralisrini more like extra mis-info :-)
Not at all. Dependency conflicts are unfortunately common, tough to spot, and can create all sorts of weird confusion when trying to debug.
@ghaskins Please, no apologies are necessary. Your libraries and your support of them are top-notch. Thank you!
Still, I understand the frustration of trying to run a business when library updates unexpectedly break or otherwise cause unintended headaches.
I'm happy to have been able to help, please never hesitate to ping π
Hello,
I started seeing the following error (timbre 6.5.0 and encore 3.97.0), but I am unsure what changed.