processone / eimp

Erlang Image Manipulation Process
Apache License 2.0
48 stars 20 forks source link

failed armel build of erlang-p1-eimp 1.0.23-1 #19

Open debalance opened 1 month ago

debalance commented 1 month ago

eimp repeatedly failed to FTBFS on armel: https://buildd.debian.org/status/package.php?p=erlang-p1-eimp&suite=sid

The eunit test png_to_gif_test times out. As armel is a rather low power system, I would like to know what the timeout limit is and if (and how) it can be increased.

Please note that this currently blocks providing the latest ejabberd release for Debian users.

badlop commented 1 month ago

Looking at one of the test log files, the error message is

``` ======================== EUnit ======================== module 'eimp_app' module 'eimp_sup' module 'eimp_limit' module 'eimp' module 'eimp_worker' module 'eimp_test' eimp_test: start_test...[0.043 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...[0.130 s] ok eimp_test: png_to_webp_test...[0.766 s] ok eimp_test: png_to_gif_test...*timed out* in function eimp_worker:do_call/6 in call from eimp:call/1 (eimp.erl, line 169) in call from eimp_test:convert/3 (eimp_test.erl, line 174) in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 273) in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71) in call from eunit_proc:run_test/1 (eunit_proc.erl, line 531) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 356) in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 514) undefined ======================================================= Failed: 0. Skipped: 0. Passed: 5. One or more tests were cancelled. Cover analysis: /<>/.eunit/index.html Coverdata export: /<>/.eunit/cover.coverdata =INFO REPORT==== 9-Oct-2024::05:42:46.454624 === application: p1_utils exited: killed type: temporary =INFO REPORT==== 9-Oct-2024::05:42:46.455167 === application: eimp exited: killed type: temporary ERROR: One or more eunit tests failed. ERROR: eunit failed while processing /<>: rebar_abort make[2]: *** [Makefile:35: test] Error 1 make[2]: Leaving directory '/<>' make[1]: *** [debian/rules:29: override_dh_auto_test] Error 2 make[1]: Leaving directory '/<>' make: *** [debian/rules:12: binary-arch] Error 2 dpkg-buildpackage: error: debian/rules binary-arch subprocess returned exit status 2 -------------------------------------------------------------------------------- Build finished at 2024-10-09T05:42:46Z Finished -------- ```

In my test machine I was able to make that testcase fail by decreasing CALL_TIMEOUT in eimp_worker:

diff --git a/src/eimp_worker.erl b/src/eimp_worker.erl
index 17e1e2a..48945fc 100644
--- a/src/eimp_worker.erl
+++ b/src/eimp_worker.erl
@@ -27,7 +27,7 @@
     terminate/2, code_change/3]).

 -define(MAX_RETRIES, 2).
--define(CALL_TIMEOUT, 30000).
+-define(CALL_TIMEOUT, 100).

 -record(state, {port :: undefined | port(),
        links = sets:new() :: sets:set(),

However, in my case the error message does not look identical:

``` # ./configure; rebar3 xref; rebar3 eunit -v ... ===> Performing EUnit tests... ======================== EUnit ======================== file "eimp.app" application 'eimp' module 'eimp' module 'eimp_app' module 'eimp_limit' module 'eimp_sup' module 'eimp_worker' [done in 0.006 s] [done in 0.009 s] module 'eimp_test' eimp_test: start_test...[0.007 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...[0.032 s] ok eimp_test: png_to_webp_test...*failed* in function eimp_test:convert/3 (/home/bernar/e/git/librerias/eimp/test/eimp_test.erl, line 174) in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 274) in call from eunit_test:run_testfun/1 (eunit_test.erl, line 72) in call from eunit_proc:run_test/1 (eunit_proc.erl, line 544) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 369) in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 527) in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 469) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 359) **error:{badmatch,{error,timeout}} output:<<"">> eimp_test: png_to_gif_test...[0.059 s] ok eimp_test: jpeg_to_png_test...*failed* in function eimp_test:convert/3 (/home/bernar/e/git/librerias/eimp/test/eimp_test.erl, line 174) in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 274) in call from eunit_test:run_testfun/1 (eunit_test.erl, line 72) in call from eunit_proc:run_test/1 (eunit_proc.erl, line 544) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 369) in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 527) in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 469) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 359) **error:{badmatch,{error,timeout}} output:<<"">> ... ```

Let's hope the differences in the error message are related to the Erlang-EUnit version (I used Erlang 26.3 and 27.1), or the test environment...

debalance commented 1 month ago

I managed to reproduce the error in an armel chroot on https://db.debian.org/machines.cgi?host=amdahl.

Unfortunately, even massively increasing that timeout (to 90000000) does not help, so the issue is something else. Any pointers on how I can further debug the problem?

``` ======================== EUnit ======================== module 'eimp_limit' module 'eimp' module 'eimp_test' eimp_test: start_test...[0.021 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...[0.087 s] ok eimp_test: png_to_webp_test...[0.375 s] ok eimp_test: png_to_gif_test...*timed out* in function eimp_worker:do_call/6 in call from eimp:call/1 (eimp.erl, line 169) in call from eimp_test:convert/3 (eimp_test.erl, line 174) in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 273) in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71) in call from eunit_proc:run_test/1 (eunit_proc.erl, line 531) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 356) in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 514) undefined ======================================================= Failed: 0. Skipped: 0. Passed: 5. One or more tests were cancelled. Cover analysis: /home/debalance/erlang-p1-eimp-1.0.23/.eunit/index.html Coverdata export: /home/debalance/erlang-p1-eimp-1.0.23/.eunit/cover.coverdata =INFO REPORT==== 27-Oct-2024::08:54:51.238735 === application: p1_utils exited: killed type: temporary =INFO REPORT==== 27-Oct-2024::08:54:51.238951 === application: eimp exited: killed type: temporary ERROR: One or more eunit tests failed. ERROR: eunit failed while processing /home/debalance/erlang-p1-eimp-1.0.23: rebar_abort make[2]: *** [Makefile:35: test] Error 1 make[2]: Leaving directory '/home/debalance/erlang-p1-eimp-1.0.23' make[1]: *** [debian/rules:29: override_dh_auto_test] Error 2 make[1]: Leaving directory '/home/debalance/erlang-p1-eimp-1.0.23' make: *** [debian/rules:12: binary] Error 2 dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2 ```
prefiks commented 1 month ago

It looks like issue with gif format writing, reading png seems to be working (as previous tests that uses this same file passes), so it probably problem with writing part. May be just problem with conversion to 256 colors that gif require?

Maybe you could try checking if smaller images pass this test, by replacing test/img.png with something smaller? And if that doesn't work we could just disable that gif test on that platform, i could try adding code that could do that.

debalance commented 1 month ago

I will give it a try.

badlop commented 3 weeks ago

Good news! I looked again with more detail, and found where to increase the timeout.

This is a long explanation, if in a hurry, go to the Increase timeout section:

Who defines that timeout?

We run rebar3 (or rebar), which runs EUnit, which runs eimp erlang code, which runs eimp C code. Let's see who measures times and what are their timeout values:

EUnit documentation says:

The default timeout for an individual test is 5 seconds.

Introduce a delay

With that in mind, let's add a delay in all tests:

--- a/src/eimp_worker.erl
+++ b/src/eimp_worker.erl
@@ -49,6 +49,7 @@ call(Data) ->
 -spec call(binary(), non_neg_integer()) ->
          {ok, binary()} | {error, eimp:error_reason()}.
 call(Data, Timeout) ->
+    timer:sleep(5000),
     case eimp:is_gd_compiled() of
    true ->
        StartTime = p1_time_compat:monotonic_time(milli_seconds),

The error message that is thrown looks close to what you get:

```sh $ REBAR=rebar make test ... ======================== EUnit ======================== module 'eimp_worker' module 'eimp_sup' module 'eimp' module 'eimp_test' eimp_test: start_test...[0.012 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...*timed out* in function timer:sleep/1 (timer.erl, line 235) in call from eimp_worker:call/2 (src/eimp_worker.erl, line 52) in call from eimp:call/1 (src/eimp.erl, line 169) in call from eimp_test:convert/3 (test/eimp_test.erl, line 174) in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 273) in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71) in call from eunit_proc:run_test/1 (eunit_proc.erl, line 543) in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 368) undefined ```

We see a *timed out* error message when sleeping 5 seconds. If sleep is only 4 seconds or less, then the test suceed. This confirms the timeout is detected by EUnit, evn if it is caused by a delay in eimp.

Increase timeout

How to change that timeout?

Let's add the line:

--- a/rebar.config
+++ b/rebar.config
@@ -42,6 +42,8 @@

 {xref_checks, [undefined_function_calls, undefined_functions, deprecated_function_calls, deprecated_functions]}.

+{eunit_opts, [{scale_timeouts, 10}]}.
+
 %% Local Variables:
 %% mode: erlang
 %% End:

Results

```sh $ REBAR=rebar make test ... module 'eimp_test' eimp_test: start_test...[0.018 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...[5.068 s] ok eimp_test: png_to_webp_test...[5.175 s] ok eimp_test: png_to_gif_test...[5.061 s] ok eimp_test: png_to_gif1_test...[5.101 s] ok eimp_test: png_to_gif2_test...[5.059 s] ok ... All 41 tests passed. $ REBAR=rebar3 make test ... module 'eimp_test' eimp_test: start_test...[0.007 s] ok eimp_test: is_supported_test...ok eimp_test: supported_formats_test...ok eimp_test: png_to_jpeg_test...[5.056 s] ok eimp_test: png_to_webp_test...[5.170 s] ok ... ```

Great!!

What tests really fail?

Let's see again at your test results:

module 'eimp_test'
  eimp_test: start_test...[0.039 s] ok
  eimp_test: is_supported_test...ok
  eimp_test: supported_formats_test...ok
  eimp_test: png_to_jpeg_test...[0.134 s] ok
  eimp_test: png_to_webp_test...[0.745 s] ok
  eimp_test: png_to_gif_test...*timed out*

This only shows that five tests passed and one failed. But what about the other ones?

Those tests are defined in src/eimp_test.erl, and are executed in the same order that the functions are defined. You can simply move the _test functions up and down, or remove them.

My question is:

Checks to clarify that: