rrthomas / mmv

[homebrew] regression test failure with 2.6 release build #19

Closed chenrui333 closed 5 months ago

chenrui333 commented 6 months ago

👋 I'm trying to build the latest release, but ran into a build issue. The error log is below:

error test failure log

```
==> /opt/homebrew/Cellar/mmv/2.6/bin/mmv -p a b 2>&1
sh: line 1: 7901 Segmentation fault: 11 /opt/homebrew/Cellar/mmv/2.6/bin/mmv -p a b 2>&1
Error: mmv: failed
An exception occurred within a child process:
  Minitest::Assertion: Expected: 1
  Actual: 139
```

Full build log: https://github.com/Homebrew/homebrew-core/actions/runs/7854840651/job/21436051656?pr=162273

Relates to Homebrew/homebrew-core#162273
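
For readers unfamiliar with the numbers in the log: the test expected exit status 1 and got 139. Shells conventionally report a process killed by a signal as 128 plus the signal number, so 139 corresponds to signal 11, which matches the "Segmentation fault: 11" line. A minimal sketch of that arithmetic, assuming the usual shell convention (the function name is made up for illustration):

```c
#include <signal.h>

/* Returns the signal number encoded in a shell-style exit status,
   or 0 if the status does not look like "killed by a signal".
   Assumes the common 128 + signal convention used by sh. */
int signal_from_shell_status(int status)
{
    return (status > 128 && status < 256) ? status - 128 : 0;
}
```

On Linux and macOS, `signal_from_shell_status(139)` compares equal to `SIGSEGV`.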

rrthomas commented 6 months ago

I see mmv segfaults. I will need more information to diagnose the problem, specifically a backtrace, or a way to reproduce the crash. I tried mimicking the test, and mmv didn't crash for me. I built mmv with ASAN and got no errors or warnings while running this test, and it gives the output expected by your test suite.

rrthomas commented 6 months ago

Random crashes in this sort of code often indicate a problem with the way symbols are overridden to point to libgc; this is sometimes triggered by a change to gnulib, so you should make sure you're using the gnulib sources as shipped with mmv (I didn't check if this is the case, I just remember that some packagers like to override supplied gnulib files). Relatedly, I noticed that some calls to free() had crept back in in the last few years, which should both be unnecessary, and could potentially cause problems, so I have removed them in master.

rrthomas commented 6 months ago

I thought about it a bit and removed the use of libgc: it's not very useful for mmv, as mmv's memory allocations are mostly not garbage until the program exits. If you can test the current git master with the brew build, that would be most helpful.

chenrui333 commented 6 months ago

@rrthomas in case it is helpful, this is the stack trace when running the failed test

$ lldb /opt/homebrew/Cellar/mmv/2.6/bin/mmv
(lldb) target create "/opt/homebrew/Cellar/mmv/2.6/bin/mmv"
Current executable set to '/opt/homebrew/Cellar/mmv/2.6/bin/mmv' (arm64).
(lldb) run -p a b
Process 84001 launched: '/opt/homebrew/Cellar/mmv/2.6/bin/mmv' (arm64)
Process 84001 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x27008800)
    frame #0: 0x0000000183b86ad0 libsystem_platform.dylib`_platform_strcmp + 64
libsystem_platform.dylib`:
->  0x183b86ad0 <+64>: ldr    q0, [x0], #0x10
    0x183b86ad4 <+68>: ldr    q1, [x1], #0x10
    0x183b86ad8 <+72>: cmeq.16b v1, v0, v1
    0x183b86adc <+76>: and.16b v0, v0, v1
Target 0: (mmv) stopped.

rrthomas commented 6 months ago

Sorry, I need a C traceback. I'm not even on an ARM platform!

chenrui333 commented 6 months ago

@rrthomas would you mind sharing some guidance on generating the C traceback?

Also I might need to update my build (currently I am using release config settings)

rrthomas commented 6 months ago

Sorry, a) that's not really mmv-specific, and b) I don't know your OS or toolchain. I strongly suggest you test with the latest git master first, because that may have fixed the problem anyway.

chenrui333 commented 6 months ago

yeah, I can give the latest head commit a try.

chenrui333 commented 6 months ago

I just tried the head commit, still the same issue.

chenrui333 commented 6 months ago

I have an idea: we should plug in the macOS CI jobs to see if it happens in the same way.

rrthomas commented 6 months ago

Good idea. Story of my life at the moment: mysterious crashes/test failures on macOS (which I don't use), while GitHub CI is happily passing…

cho-m commented 6 months ago

Which basename() is intended in https://github.com/rrthomas/mmv/commit/0ccbb1f8818fe7fbc44e876b261a30c908dbe2ee?

On macOS, Homebrew currently has to compile with -Wno-implicit-function-declaration due to some missing #include headers. If I remove that and manually add the obvious headers, I hit:

mmv.c:1760:22: error: call to undeclared function 'basename'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
                const char *name = basename(program_name);
                                   ^
mmv.c:1760:22: note: did you mean 'base_name'?
./lib/dirname.h:36:7: note: 'base_name' declared here
char *base_name (char const *file) _GL_ATTRIBUTE_MALLOC;
      ^
1 error generated.

I can't find much documentation on gnulib's base_name(), but using it does not seem to segfault.


e.g. From 2.6 release using whatever the implicit basename() is on macOS:

(lldb) breakpoint set -f mmv.c -l 1762
Breakpoint 1: where = mmv`main + 1000 at mmv.c:1762:14, address = 0x0000000100002728

(lldb) run
Process 74252 launched: '/opt/homebrew/bin/mmv' (arm64)
Process 74252 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100002728 mmv`main(argc=4, argv=0x000000016fdfc7c8) at mmv.c:1762:14
   1759     else {
   1760         const char *name = basename(program_name);
   1761
-> 1762         if (strcmp(name, COPYNAME) == 0)
   1763             op = NORMCOPY;
   1764         else if (strcmp(name, APPENDNAME) == 0)
   1765             op = APPEND;
Target 0: (mmv) stopped.

(lldb) v name
(const char *) name = 0x0000000045808e00 ""

(lldb) continue
Process 74252 resuming
Process 74252 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x45808e00)
    frame #0: 0x0000000185c0aad0 libsystem_platform.dylib`_platform_strcmp + 64
libsystem_platform.dylib`:
->  0x185c0aad0 <+64>: ldr    q0, [x0], #0x10
    0x185c0aad4 <+68>: ldr    q1, [x1], #0x10
    0x185c0aad8 <+72>: cmeq.16b v1, v0, v1
    0x185c0aadc <+76>: and.16b v0, v0, v1
Target 0: (mmv) stopped.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x45808e00)
  * frame #0: 0x0000000185c0aad0 libsystem_platform.dylib`_platform_strcmp + 64
    frame #1: 0x0000000100002738 mmv`main(argc=4, argv=0x000000016fdfc7c8) at mmv.c:1762:7
    frame #2: 0x000000018585d0e0 dyld`start + 2360
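
A likely explanation for the bogus `name` pointer above, though nobody in the thread states it: with -Wno-implicit-function-declaration, the compiler treats the undeclared basename() as returning `int`, so on a 64-bit target only the low 32 bits of the returned pointer survive. That would fit the address 0x45808e00 in the crash, which fits in 32 bits. The sketch below only models that truncation with integer arithmetic; the function name is hypothetical and nothing here calls basename().

```c
#include <stdint.h>

/* Models what a caller sees when a pointer-returning function is used
   through an implicit "returns int" declaration on a 64-bit target. */
uint64_t pointer_bits_seen_as_int(uint64_t real_pointer_bits)
{
    /* Keep only the low 32 bits, as an int-sized return value would.
       (Out-of-range conversion to int32_t is implementation-defined;
       GCC and Clang wrap modulo 2^32.) */
    int32_t as_int = (int32_t)(uint32_t)real_pointer_bits;

    /* Converting that int back to a pointer-sized value sign-extends it. */
    return (uint64_t)(int64_t)as_int;
}
```

For example, a 64-bit heap address such as 0x0000600003c50000 would come back as 0x03c50000, which no longer points at the original string.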

e.g. Using base_name():

(lldb) breakpoint set -f mmv.c -l 1762
Breakpoint 1: where = mmv`main + 992 at mmv.c:1762:14, address = 0x00000001000025bc

(lldb) run
Process 77486 launched: '/opt/homebrew/bin/mmv' (arm64)
Process 77486 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001000025bc mmv`main(argc=4, argv=0x000000016fdfc7c8) at mmv.c:1762:14
   1759     else {
   1760         const char *name = base_name(program_name);
   1761
-> 1762         if (strcmp(name, COPYNAME) == 0)
   1763             op = NORMCOPY;
   1764         else if (strcmp(name, APPENDNAME) == 0)
   1765             op = APPEND;
Target 0: (mmv) stopped.

(lldb) v name
(const char *) name = 0x0000600003c50000 "mmv"

(lldb) continue
Process 77486 resuming
a -> b : old b would have to be deleted.
Nothing done.
Process 77486 exited with status = 1 (0x00000001)

rrthomas commented 6 months ago

Good catch! Of course, using the dirname gnulib module, I meant base_name. I'll fix this now.
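
For comparison, here is a sketch of the same name-based dispatch using POSIX basename() with its header included, so the compiler knows it returns a pointer. This is not what mmv does (the fix described above is to call gnulib's base_name); the enum, the function name, and the "mcp"/"mad" strings standing in for COPYNAME and APPENDNAME are assumptions for illustration.

```c
#include <libgen.h>  /* declares POSIX basename(), so no implicit int */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real values are not shown in this thread. */
#define COPYNAME   "mcp"
#define APPENDNAME "mad"

enum op { OP_MOVE, OP_COPY, OP_APPEND };

enum op op_from_program_name(const char *program_name)
{
    enum op result = OP_MOVE;

    /* POSIX basename() may modify its argument, so work on a copy. */
    size_t len = strlen(program_name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return result;
    memcpy(copy, program_name, len);

    /* The result may point into copy or into static storage,
       so compare it before freeing copy. */
    const char *name = basename(copy);
    if (strcmp(name, COPYNAME) == 0)
        result = OP_COPY;
    else if (strcmp(name, APPENDNAME) == 0)
        result = OP_APPEND;

    free(copy);
    return result;
}
```

Under these assumptions, `op_from_program_name("/usr/local/bin/mcp")` selects the copy operation, while any other name falls through to the default.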

rrthomas commented 6 months ago

> Good idea. Story of my life at the moment: mysterious crashes/test failures on macOS (which I don't use), while GitHub CI is happily passing…

D'oh, this project does not currently have a GitHub CI build.

rrthomas commented 5 months ago

Closing this issue, as there is now a working GitHub CI build for macOS.