Closed: samsondav closed this 5 years ago
So I cannot reproduce this. Here is a snippet that defines 100 test modules with 100 tests each:
require Logger
ExUnit.start(capture_log: true)
for i <- 1..100 do
  defmodule Module.concat(["Test#{i}"]) do
    use ExUnit.Case, async: :rand.uniform > 0.5
    for i <- 1..100 do
      test :"#{i}" do
        Logger.debug "ok"
      end
    end
  end
end
It passes just fine.
Looking at the code, the only situation where I could see this happening is if the Logger supervisor restarts, maybe because a custom backend is failing. Are you using any custom backends? Does removing them affect the failures?
We do not use any custom logger backends. Our setup looks like this:
config :logger, :console, metadata: :all
config :logger, :console, level: :error
NB: We do use a custom backend in prod mode, but not in test mode.
@josevalim I'm not really sure what else I can give you. The failures are intermittent - sometimes a test will fail and sometimes it will pass. But it is always the same tests that flicker.
Is there anything else I can share with you that will help us to debug this?
One option is to add a file logger backend and then check whether it has logged any useful information. The issue in cases like this is that something else may have been logged, but we wouldn't know because there is no logger backend.
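For illustration, a file backend setup might look roughly like the sketch below. The thread does not name a specific backend, so this assumes the third-party logger_file_backend package; the backend name :debug_log and the path are hypothetical.

```elixir
# config/test.exs
# Assumption: the logger_file_backend package is listed in mix.exs deps.
# :debug_log and the path below are hypothetical names chosen for this sketch.
config :logger,
  backends: [:console, {LoggerFileBackend, :debug_log}]

config :logger, :debug_log,
  path: "log/test_debug.log",
  level: :debug
```

After a flaky run, anything logged shortly before the capture handler disappeared should then be in the file, even though the console output was captured.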
I managed to reproduce it:
require Logger
Application.put_env(:logger, :console, [
  format: "\n$time $metadata[$level] $levelpad$message\n",
  metadata: :all
])
ExUnit.start(capture_log: true)
defmodule Module.CaptureLogError do
  use ExUnit.Case
  test "boom!" do
    Logger.metadata(boom: <<1::1>>)
    Logger.info "boom!"
  end
end
@samphilipd it should not fail if you remove the metadata: :all config. The next step would be to figure out what's killing the capture log handler.
Metadata has to be printable. But that shouldn't kill the handler either.
José Valim, Founder and Director of R&D, www.plataformatec.com.br
Here is my hypothesis.
The StringIO device does not handle non-printable metadata well: it raises an unrescued ArgumentError when calling :unicode.characters_to_binary/3, which kills the device process.
The handler monitors the device when logging, and if the device goes down, the handler raises, which I believe kills the handler.
This doesn't happen with the :user device, which replies with {:error, {:put_chars, :unicode, <bad data>}} if you provide non-printable chars.
I think everything should work if the ArgumentError is rescued in StringIO.
Thoughts?
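As a rough sketch of the idea (not the actual StringIO code or the fix that was merged), rescuing the conversion error and replying with an error tuple could look like this. The module SafeChars and the function to_binary/2 are hypothetical names used only for illustration.

```elixir
defmodule SafeChars do
  # Hypothetical helper for illustration; this is not the real StringIO code.
  # It converts chardata to a binary and returns an error tuple instead of
  # raising, so the calling process is not killed by bad input.
  def to_binary(chars, encoding \\ :unicode) do
    case :unicode.characters_to_binary(chars, encoding, :unicode) do
      binary when is_binary(binary) ->
        {:ok, binary}

      # {:error, converted, rest} or {:incomplete, converted, rest}
      {_reason, _converted, _rest} ->
        {:error, {:put_chars, encoding, chars}}
    end
  rescue
    # Raised for input that is not chardata at all, e.g. the bitstring <<1::1>>
    ArgumentError -> {:error, {:put_chars, encoding, chars}}
  end
end
```

With a helper like this, the device could reply to the IO request with the error tuple and keep running, similar to what the :user device is described as doing above.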
Awesome, you got it:
iex(2)> :io.put_chars <<1::1>>
** (ArgumentError) argument error
    (stdlib) :io.put_chars(#PID<0.63.0>, :unicode, <<1::size(1)>>)
iex(2)> {:ok, pid} = StringIO.open("")
{:ok, #PID<0.112.0>}
iex(3)> :io.put_chars pid, <<1::1>>
10:15:57.990 [error] GenServer #PID<0.112.0> terminating
** (ArgumentError) argument error
    (stdlib) :unicode.characters_to_binary(<<1::size(1)>>, :unicode, :unicode)
    (elixir) lib/string_io.ex:275: StringIO.put_chars/4
    (elixir) lib/string_io.ex:199: StringIO.io_request/4
    (elixir) lib/string_io.ex:177: StringIO.handle_info/2
    (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:io_request, #PID<0.105.0>, #Reference<0.656754156.1514405889.157709>, {:put_chars, :unicode, <<1::size(1)>>}}
State: %{capture_prompt: false, input: "", output: ""}
** (EXIT from #PID<0.105.0>) shell process exited with reason: an exception was raised:
    ** (ArgumentError) argument error
        (stdlib) :unicode.characters_to_binary(<<1::size(1)>>, :unicode, :unicode)
        (elixir) lib/string_io.ex:275: StringIO.put_chars/4
        (elixir) lib/string_io.ex:199: StringIO.io_request/4
        (elixir) lib/string_io.ex:177: StringIO.handle_info/2
        (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
        (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
        (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
And I know the default IO does rescue unicode conversion errors and returns {:error, :badarg} from them, so you are correct regarding the fix too.
The handling of non-printable data in Logger's metadata is fixed in master.
@samphilipd could you confirm that was the issue in your case?
Closing as it has been fixed.
@josevalim I think we are facing a regression or similar bug:
Elixir 1.9.4 (compiled with Erlang/OTP 22)
0) ImaginatorWeb.TransformControllerTest: failure on setup_all callback, all tests have been invalidated
   ** (EXIT from #PID<0.763.0>) exited in: ExUnit.CaptureLog.remove_capture(#PID<0.764.0>)
       ** (EXIT) {:error, :module_not_found}
0) ImaginatorWeb.InfoControllerTest: failure on setup_all callback, all tests have been invalidated
   ** (EXIT from #PID<0.777.0>) exited in: ExUnit.CaptureLog.remove_capture(#PID<0.778.0>)
       ** (EXIT) {:error, :module_not_found}
We are using unmodified generated ConnCase and do not add any setup_all callbacks.
Removing config :logger, :console, metadata: :all solves the issue, but we can't do that as we have assertions on Logger metadata.
Do you have any way to reproduce the error? You can try running the suite with SASL logs enabled, as they may tell the actual reason why the handler was removed.
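One way to enable SASL logs for the test run, sketched here under the assumption that Logger's :handle_sasl_reports option is what is meant, is a single line in the test config:

```elixir
# config/test.exs
# Assumption: Logger's :handle_sasl_reports boot option (defaults to false)
# is used to route SASL reports, such as crash and supervisor reports,
# through Logger.
config :logger, handle_sasl_reports: true
```

The reports may still be swallowed while capture_log is active, so running the failing test module without capture may be needed to see them.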
@josevalim So the issue is that we have two backends configured in our app: LoggerJSON, which we use in production, and :console, which is used in the test environment (test.exs):
config :logger,
  backends: [:console],
  level: :info
But the Elixir :console logger backend still receives fragments encoded by LoggerJSON.Plug as an event:
ERROR REPORT==== 22-Dec-2019::18:57:05.495250 ===
** gen_event handler {'Elixir.Logger.Backends.Console',<0.973.0>} crashed.
** Was installed in 'Elixir.Logger'
** Last event was: {info,<0.62.0>,
{'Elixir.Logger',<<>>,
{{2019,12,22},{11,57,5,495}},
[{pid,<0.975.0>},
{node,
#{'__struct__' => 'Elixir.Jason.Fragment',
encode =>
#Fun<Elixir.LoggerJSON.Plug.MetadataFormatters.GoogleCloudLogger.1.58295238>}},
{httpRequest,
#{'__struct__' => 'Elixir.Jason.Fragment',
encode =>
#Fun<Elixir.LoggerJSON.Plug.MetadataFormatters.GoogleCloudLogger.0.58295238>}},
{phoenix,
#{'__struct__' => 'Elixir.Jason.Fragment',
encode =>
#Fun<Elixir.LoggerJSON.Plug.MetadataFormatters.GoogleCloudLogger.3.58295238>}},
{line,44},
{function,<<"call/2">>},
{module,'Elixir.LoggerJSON.Plug'},
{file,<<"lib/logger_json/plug.ex">>},
{application,logger_json},
{request_id,<<"FeKvibPMtoinvXMAAAiK">>}]}}
** When handler state == #{'__struct__' => 'Elixir.Logger.Backends.Console',
buffer => [],buffer_size => 0,
colors =>
#{debug => cyan,enabled => true,error => red,
info => normal,warn => yellow},
device => <0.973.0>,
format =>
[<<"\n">>,time,<<" ">>,metadata,<<"[">>,level,
<<"] ">>,levelpad,message,<<"\n">>],
level => nil,max_buffer => 32,metadata => all,
output => nil,ref => nil}
** Reason == {#{'__exception__' => true,
'__struct__' => 'Elixir.Protocol.UndefinedError',
description => <<>>,protocol => 'Elixir.String.Chars',
value =>
#{'__struct__' => 'Elixir.Jason.Fragment',
encode =>
#Fun<Elixir.LoggerJSON.Plug.MetadataFormatters.GoogleCloudLogger.1.58295238>}},
[{'Elixir.String.Chars','impl_for!',1,
[{file,"lib/string/chars.ex"},{line,3}]},
{'Elixir.String.Chars',to_string,1,
[{file,"lib/string/chars.ex"},{line,22}]},
{'Elixir.Logger.Formatter','-output/5-fun-0-',1,
[{file,"lib/logger/formatter.ex"},{line,166}]},
{'Elixir.Enum','-map/2-lists^map/1-0-',2,
[{file,"lib/enum.ex"},{line,1336}]},
{'Elixir.Enum','-map/2-lists^map/1-0-',2,
[{file,"lib/enum.ex"},{line,1336}]},
{'Elixir.Logger.Formatter','-format/5-fun-0-',6,
[{file,"lib/logger/formatter.ex"},{line,152}]},
So, in the end, it crashes because it can't convert the Jason.Fragment prepared by the plug to a string.
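The failure mode in the report above can be reproduced in isolation. The sketch below uses a hypothetical struct, FakeFragment, as a stand-in for Jason.Fragment, so it runs without any dependencies. Like the real fragment, it has no String.Chars implementation, which is what the metadata formatter ends up requiring.

```elixir
defmodule FakeFragment do
  # Hypothetical stand-in for Jason.Fragment: a struct holding an encode
  # function and no String.Chars implementation.
  defstruct [:encode]
end

result =
  try do
    # to_string/1 dispatches through the String.Chars protocol, as the
    # formatter does in the stack trace above.
    to_string(%FakeFragment{encode: fn _opts -> "{}" end})
  rescue
    error in Protocol.UndefinedError -> {:error, error.protocol}
  end

IO.inspect(result)
```

Inside the console backend there is nothing to rescue this exception, so the gen_event handler crashes and is removed, which matches the error report.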
As far as I know if you have the LoggerJSON plug in your pipeline, it will always add Jason metadata. So you need to configure :metadata accordingly instead of :all.
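A minimal sketch of such a configuration, assuming the tests only need a handful of printable keys (the list below is an example taken from the keys visible in the error report, not a recommendation):

```elixir
# config/test.exs
# Only the listed keys are formatted by the console backend, so the
# Jason.Fragment values added by LoggerJSON.Plug are never converted
# to strings.
config :logger, :console,
  metadata: [:request_id, :module, :function, :line]
```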
That is true, and it's purely an application error. Thank you.
I am having the exact same problem with Elixir 1.10.0.
$ elixir --version
Erlang/OTP 22 [erts-10.5.5] [source] [64-bit] [smp:6:6] [ds:6:6:10] [async-threads:1] [hipe]
Elixir 1.10.0 (compiled with Erlang/OTP 22)
I made a minimal project with the code above and here is what I get:
$ mix test
0) Myapp.CaptureLogErrorTest: failure on setup_all callback, all tests have been invalidated
   ** (EXIT from #PID<0.150.0>) exited in: ExUnit.CaptureLog.remove_capture(#PID<0.151.0>)
       ** (EXIT) {:error, :module_not_found}
Finished in 0.05 seconds
0 failures
Randomized with seed 548524
Can you reproduce?
Thanks, I was able to reproduce it. It is related to #9814 and it has been fixed on master and v1.10 branch. We will release v1.10.2 in a week with the fix.
@josevalim Could there be another regression? We have the same issue with quantum. (I did not verify it myself yet.)
Environment
Current behavior
I changed this line:
ExUnit.start()
to
ExUnit.start(capture_log: true)
in order to silence log output during tests. However, some tests are now failing with this error message:
Any ideas?