eproxus / meck

A mocking library for Erlang
http://eproxus.github.io/meck
Apache License 2.0

Build failed on mips and sh4 at Debian #105

Closed niwamatsu closed 7 years ago

niwamatsu commented 11 years ago

Hi,

The meck build fails on mips and sh4 on Debian.

The build logs follow:
mips: https://buildd.debian.org/status/fetch.php?pkg=erlang-meck&arch=mips&ver=0.7.2-2&stamp=1349831619
sh4: http://buildd.debian-ports.org/status/fetch.php?pkg=erlang-meck&arch=sh4&ver=0.7.2-2&stamp=1349841585

Both architectures seem to fail at the same point.


meck_tests: sequence_multi_...[0.128 s] ok
meck_tests: loop_...[0.051 s] ok
meck_tests: loop_multi_...[0.154 s] ok
meck_tests: call_original_test...[0.125 s] ok
meck_tests: unload_renamed_original_test...[0.064 s] ok
meck_tests: unload_all_test...[0.325 s] ok
meck_tests: original_no_file_test...[0.089 s] ok
meck_tests: original_has_no_object_code_test...[0.047 s] ok
meck_tests: passthrough_nonexisting_module_test...[0.172 s] ok
meck_tests: passthrough_test...[0.130 s] ok
meck_tests: passthrough_different_arg_test...[0.091 s] ok
meck_tests: passthrough_bif_test...[2.473 s] ok
meck_tests: cover_test...*timed out*

undefined

Failed: 0. Skipped: 0. Passed: 64.

Could you advise me on how to fix this problem? If it is a timeout error, could you tell me how to extend the timeout?

Thanks, Nobuhiro

eproxus commented 11 years ago

If you have access to the machines (or architectures) in question, you could try to debug the error. The test case can be found here: https://github.com/eproxus/meck/blob/develop/test/meck_tests.erl#L922

My suggestion would be to run the test case in isolation first, to see if it still is a problem:

rebar eunit tests=cover_test

The next step would be to enable tracing on the functions called in the test, to see where it hangs. You can add this to the top of the test case:

cover_test() ->
    dbg:tracer(), dbg:p(all, call),
    [dbg:tpl(cover, F, x) || F <- [compile, analyze]],
    [dbg:tpl(meck_test_module, F, x) || F <- [a, b, c]],
    [dbg:tpl(meck, F, x) || F <- [new, expect, unload]],
    dbg:tpl(filelib, is_file, x),
    ...

Then just run the test case again, preferably using the tests=... argument to Rebar to avoid tracing from the rest of the tests.
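On the timeout question above: EUnit stops a single test after 5 seconds by default, and a test generator can raise that limit with a {timeout, Seconds, Test} wrapper. The following is only a minimal sketch of that mechanism, not the actual cover_test from meck; the module name timeout_example and the helper slow_operation/0 are made up for illustration, and 60 seconds is an arbitrary value.

```erlang
-module(timeout_example).
-include_lib("eunit/include/eunit.hrl").

%% Hypothetical stand-in for the slow work done by a real test.
slow_operation() ->
    timer:sleep(100),
    ok.

%% Test generator (note the trailing underscore in the name).
%% The {timeout, Seconds, Test} tuple raises the per-test limit
%% from EUnit's default of 5 seconds to 60 seconds.
slow_operation_test_() ->
    {timeout, 60, fun() -> ?assertEqual(ok, slow_operation()) end}.
```

A longer timeout would only hide a slow machine; if the test really hangs, the tracing approach above is still needed to find out where.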

eproxus commented 11 years ago

Closing until further updates.

lemenkov commented 7 years ago

Actually, I have some news here. The issue still seems to be there, and I can reproduce it quite easily on one machine I have access to (PPC64 architecture). Interestingly, the issue is intermittent - take a look at the failure arch matrix from two different build attempts:

I'll post an update in a few days.

eproxus commented 7 years ago

@lemenkov Is it the same error? I can't seem to find the build log for those new builds...

lemenkov commented 7 years ago

@eproxus I believe it's almost the same (unfortunately, failed build logs are disposed of rather quickly in the Fedora Project). "Almost", because I see a difference - I get an I/O premature shutdown failure (very similar to this one - https://stackoverflow.com/questions/6491897/erlang-spawn-problems/6499774#6499774 ) instead of a timeout.

lemenkov commented 7 years ago

I'll post new logs shortly.

eproxus commented 7 years ago

I would like new logs, since the last ones are four years old 😄 A lot has happened in Meck since then (including changing the build system to Rebar 3).

lemenkov commented 7 years ago

Sorry for being late. Expect a fix soon.

lemenkov commented 7 years ago

Yes, this cannot be reproduced with Rebar3, so this issue is related to Rebar2. I cannot find a specific reason for this behavior, but I finally found a fix. Apparently Rebar2 on these arches doesn't like it when you do meck:new(...) followed by meck:unload() inside a test. These calls must be moved into a fixture (foreach).

This is a Rebar2-specific issue, and I cannot reproduce it with Rebar3. I also tested the patched version with Rebar3 on my PPC64 machine, and it works. Since it doesn't hurt any Rebar3 users, why not apply it? :)

eproxus commented 7 years ago

What is the actual error? Just timeouts?

lemenkov commented 7 years ago

@eproxus that's the tricky question. And I still can't answer.

I wasn't able to pinpoint the exact issue, but it seems that on some architectures (I can't say whether it's only big-endian ones), rebar can run tests in such a way that they interfere with each other. I believe this does have something to do with timeouts. All the tests pass when the test suites are run separately.

What I changed is very small. I reorganized the test cases in two test suites so they now have setup/teardown functions where all the meck:new(...) / meck:unload() calls happen. So no meck:new(...) / meck:unload() can happen in the middle of other tests (?) or at any other unwanted time. This fixes the issue.
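The setup/teardown arrangement described above can be sketched roughly as follows. This is an illustration of an EUnit foreach fixture and not the actual patch; the module name fixture_example and the mocked module hypothetical_dep are made up, and the non_strict option is used only so meck will mock a module that does not exist.

```erlang
-module(fixture_example).
-include_lib("eunit/include/eunit.hrl").

%% foreach runs setup/0 before and teardown/1 after every test
%% in the list, so each test gets a freshly created mock.
fixture_test_() ->
    {foreach, fun setup/0, fun teardown/1,
     [fun returns_mocked_value/1,
      fun records_call/1]}.

%% hypothetical_dep is a made-up module name; non_strict allows
%% meck to mock a module that has no real implementation.
setup() ->
    ok = meck:new(hypothetical_dep, [non_strict]),
    hypothetical_dep.

%% The mock is always unloaded here, never inside a test body.
teardown(Mod) ->
    ok = meck:unload(Mod).

returns_mocked_value(Mod) ->
    fun() ->
        ok = meck:expect(Mod, answer, fun() -> 42 end),
        ?assertEqual(42, Mod:answer())
    end.

records_call(Mod) ->
    fun() ->
        ok = meck:expect(Mod, ping, fun() -> pong end),
        pong = Mod:ping(),
        ?assert(meck:called(Mod, ping, [])),
        ?assert(meck:validate(Mod))
    end.
```

Because teardown/1 runs even when a test fails, a crashed test cannot leave a mocked module loaded for the tests that follow.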

To sum up:

eproxus commented 7 years ago

Yeah, this whole thing seems very messy. Fortunately, the refactoring doesn't really change anything and in fact the suites contain less repetition 😄

Thanks for doing the research, I'll merge it. Let me know how it works!