erlang / otp

Erlang/OTP
http://erlang.org
Apache License 2.0
11.36k stars 2.95k forks source link

Erlang 27: ct_logs:custom_stylesheet_header/1 runs into an exception when the provided path does not exist #8911

Open michaelklishin opened 2 weeks ago

michaelklishin commented 2 weeks ago

In RabbitMQ test suites that pass on Erlang 26.2, on Erlang 27 I observe the following exception in ct_logs:

=== Info ===
Updating log files
Updating /home/runner/work/rabbitmq-server/rabbitmq-server/logs/index.html ... CRASHED while updating /home/runner/work/rabbitmq-server/rabbitmq-server/logs/index.html!
{function_clause,[{ct_logs,custom_stylesheet_header,
                           [{error,does_not_exist}],
                           [{file,"ct_logs.erl"},{line,1796}]},
                  {ct_logs,header,4,[{file,"ct_logs.erl"},{line,1755}]},
                  {ct_logs,all_suites_index_header,2,
                           [{file,"ct_logs.erl"},{line,1688}]},
                  {ct_logs,make_all_suites_index2,3,
                           [{file,"ct_logs.erl"},{line,2805}]},
                  {ct_logs,make_all_suites_index1,4,
                           [{file,"ct_logs.erl"},{line,2776}]},
                  {ct_logs,make_all_suites_index,2,
                           [{file,"ct_logs.erl"},{line,2507}]},
                  {ct_run,refresh_logs,2,[{file,"ct_run.erl"},{line,1515}]},
                  {ct_master_fork,refresh_logs,2,
                                  [{file,"ct_master_fork.erl"},{line,752}]}]}

which makes the run fail even though all tests have succeeded.

How to Reproduce

So far I was unable to pin down this phenomenon to a particular test suite. It does not affect the vast majority of suites, only two are affected:

Each group of suites in the examples above is executed in parallel.

ct_logs:custom_stylesheet_header/1 accepts an argument that's either a path, an undefined or unknown but it does not handle {error, _} cases.

The custom stylesheet argument is passed to ct_logs from ct_run:refresh_logs/2.

The value passed in can be coming from erlang.mk or RabbitMQ's own code but exactly the same parameters are used on Erlang 26.2 which does not run into this exception.

Possible Solutions

If making ct_logs:custom_stylesheet_header/1 more defensive by handling {error, _} or specifically {error, does_not_exist} would be considered acceptable, I'd be happy to submit a PR.

lhoguin commented 2 weeks ago

This is a ct_master related bug that doesn't affect the results of the tests. I plan to submit patches later on along other things (hence ct_master_fork in the stacktrace).

lhoguin commented 2 weeks ago

This is a tricky one. The crash occurs because it eventually calls this function: https://github.com/erlang/otp/blob/OTP-26.2.5.2/lib/common_test/src/ct_logs.erl#L207-L208 but the call/1 function returns the error tuple seen in the crash https://github.com/erlang/otp/blob/OTP-26.2.5.2/lib/common_test/src/ct_logs.erl#L245

I don't know yet if it's because the process stopped or because it isn't running at all (since this is ct_master and it has a ct_master_logs module).

lhoguin commented 2 weeks ago

I think the process isn't running because on the master node we only have ct_master, so ct_logs isn't ever needed (besides a few utility functions). I wonder if that refresh_logs call is even needed, I'll experiment without it.

lhoguin commented 1 week ago

I have opened a PR fixing this and more.