riscv-software-src / homebrew-riscv

homebrew (macOS) packages for RISC-V toolchain
https://riscv.org

RISC-V cross compiler on m1 fails to build; kludgey workaround included #46 #47

Closed robertlipe closed 2 years ago

robertlipe commented 3 years ago

What you were trying to do (and why)

Enjoy glorious M1 hosted speed when building embedded RISC-V code. (Yes, I know that there are like ten of us in the world that care about such a crazy combination and the x86_64 hosted version works swimmingly well.)

This probably affects every cross compilation with this version, and it is a step that someone will have to work through to get "normal" GCC builds working.

Mostly, I'm trying to get this issue on record (google search) and to identify the right team to talk through a better fix. I totally get the rarity of cross compilation on Mac hosts and I'm lucky that the uncommon part - the RISC-V part - works great and this is "just" a configure issue that's tripping over an assumption that all Darwin is x64. I just need to find the right team to help land a proper fix.

I know that there's going to be an intricate series of steps involving M1 as a GCC (GNU toolchain) host (that's the issue here), GCC as an ARM8.3/M1 target to barf out the right opcodes, and GCC as a Big Sur/M1 target to handle OS calling conventions, system call interfaces, process startup and teardown. I'm pretty sure this is going to be a game of hot potato as there's whatever code that Homebrew is using, the FSF GCC, and whichever of the vendor branches (SiFive? Kendryte? Nucleisys?) is in play here. GCC's config files were also widely shared with other projects at some point, so it's not at all clear to me where the authoritative version of this file should be.

The good thing (for me, as I've been out of the GNU chain internals biz for ~20 years) is that this is an issue caused by the use of Precompiled Headers (don't care and probably irrelevant for cross anyway), and we can get away with just more thoroughly ignoring them instead of going the other way (making the startup code call the Darwin x64 code).

What happened (include command output)

Add tap "riscv/riscv"

Building GCC cross compiler fails on host-side

What you expected to happen

Regale in glorious cross-compilation in full, hot, RISCV-on-ARM action glory.

The nice thing is that you can just comment out the code that configures the (non-working) stub that sets up the Makefile to include the host-darwin code that's calling the HOST_HOOK instead.

Step-by-step reproduction instructions (by running brew install commands)

    brew tap "riscv/riscv"
    time brew install riscv-gnu-toolchain --with-multilib

(I don't think multilib is involved. I just need that for my GCC target.)

Now that you've burned 12 minutes of compute time (sigh) while configure looks for stdio.h for the trillionth time, patch ~/Library/Caches/Homebrew/riscv-gnu-toolchain--git/riscv-gcc/gcc/config.host and just neutralize the code that would have added it.

 | i[34567]86-*-go32* \
 | vax-*-vms*)
    echo "*** Configuration for host ${host} not supported" 1>&2
    exit 1
    ;;
esac

# Common parts for widely ported systems.
case ${host} in
  *-darwin*)
    # Generic darwin host support.
    # out_host_hook_obj=host-darwin.o
    # #host_xmake_file="${host_xmake_file} x-darwin"
    echo "Whee"
    ;;
esac

case ${host} in
  aarch64*-*-freebsd* | aarch64*-*-linux* | aarch64*-*-fuchsia*)
    case ${target} in
      aarch64*-*-*)

This is wrong because it probably breaks PCH on native builds, but I think a native GCC for M1 is still some time away. This matches overly broadly, but this is one hammer that brings success.
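For what it's worth, a more specific test might look roughly like the sketch below. It is only an illustration: the function name `darwin_pch_hooks` is made up, and the `aarch64*-*-darwin*` and `arm64*-*-darwin*` patterns are an assumption about what the host triplet looks like on an M1, not something checked against GCC's own `config.host`.

```shell
#!/bin/sh
# Hypothetical helper: decide per host triplet whether the Darwin
# host hooks (and with them PCH support) would stay enabled.
darwin_pch_hooks() {
  case "$1" in
    aarch64*-*-darwin* | arm64*-*-darwin*)
      # Assumed Apple Silicon triplets: skip host-darwin.o / x-darwin.
      echo "disabled"
      ;;
    *-darwin*)
      # Every other Darwin host (e.g. x86_64) keeps the stock behavior.
      echo "enabled"
      ;;
    *)
      echo "not-darwin"
      ;;
  esac
}

darwin_pch_hooks "aarch64-apple-darwin20.3.0"
darwin_pch_hooks "x86_64-apple-darwin20.3.0"
```

Putting the narrow pattern first matters, because `case` takes the first match and `*-darwin*` would otherwise swallow the Apple Silicon triplets as well.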

So I think I've done better than a "help, is broken" analysis on all this, and I DID verify that this compiler successfully compiles and links my OS kernel https://github.com/robertlipe/riscv7 and that the debugger connects to my ICE. So lots of Really Hard stuff just worked. We're tripping over something relatively easy; I just need help finding the right people to talk this through.

Please help to either accept a variation of this change (I'll work with you on making the test above more specific if you can help me identify the triplet) or to connect with the right people to work out a fix upstream so that I can help fix this in some upstream repo and help get it downstreamed or cherry-picked into the Homebrew toolkit.

I <3 that I can do a cold build of my kernel in 1.6 seconds. :-)

Thanx for being awesome!

(Originally submitted as https://github.com/Homebrew/homebrew-core/issues/69846)

willmcpherson2 commented 3 years ago

Thanks for this. I'm also building a cross-compiler on M1 (albeit for a different project and different target architecture)

So to reiterate, you can remove lines 96-97 from gcc/config.host to disable PCH

    # out_host_hook_obj=host-darwin.o
    # host_xmake_file="${host_xmake_file} x-darwin"

And for googleability, here's the error I'm fixing:

Undefined symbols for architecture arm64:
  "_host_hooks", referenced from:
      gt_pch_save(__sFILE*) in libbackend.a(ggc-common.o)
      gt_pch_restore(__sFILE*) in libbackend.a(ggc-common.o)
      toplev::main(int, char**) in libbackend.a(toplev.o)
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

aswaterman commented 3 years ago

As a new M1 user myself, maybe I should chip into the homebrew-riscv effort for a change and turn @willmcpherson2's suggestion (thanks for that!) into a pull request.

willmcpherson2 commented 3 years ago

@aswaterman I'm not sure what the consequences of disabling PCH will be. Perhaps this could be a temporary measure under an architecture flag (e.g. aarch64-apple-darwin)

aswaterman commented 3 years ago

Yeah, I'm hoping it's just a compile-time performance downgrade for the time being... hopefully it doesn't break stuff downstream, e.g., compiling libstdc++. But I assume you've already gotten that far.

In any case, I was just going to hack the homebrew scripts to sed the configure script for now. We can make the hacky sed script slightly more elaborate as you suggest if need be.
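A rough sketch of what such a sed step could look like is below. It is not the actual Homebrew formula change; the function name `neutralize_darwin_hooks` is hypothetical, and the two patterns simply assume the lines appear in `gcc/config.host` the way they are quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical filter: comment out the two Darwin host-hook lines.
# Reads config.host-style text on stdin and writes the patched text to stdout.
neutralize_darwin_hooks() {
  sed \
    -e 's/^\([[:space:]]*\)\(out_host_hook_obj=host-darwin\.o\)/\1# \2/' \
    -e 's/^\([[:space:]]*\)\(host_xmake_file=.*x-darwin.*\)/\1# \2/'
}

# Demo on a two-line sample rather than the real file.
printf '%s\n' \
  '    out_host_hook_obj=host-darwin.o' \
  '    host_xmake_file="${host_xmake_file} x-darwin"' |
  neutralize_darwin_hooks
```

Against a real tree this would be run as something like `neutralize_darwin_hooks < gcc/config.host > config.host.new`, then moving the result into place. Writing to a new file sidesteps the `sed -i` differences between BSD and GNU sed, and lines that are already commented out are left alone, so running it twice is harmless.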

robertlipe commented 3 years ago

As for building libstdc++ (I know it was an example), remember that the full RISC-V chain uses the generated GCC to build libstdc++ (and a bunch more) for the target, so it's somewhat self-exercising. While I feel a little bad about disabling this instead of tackling it full-on and figuring it out, I don't think there are really issues of correctness here. This seems to be just a mismatch: the uses of pch_save and pch_restore are compiled in, but the definitions are not. The file with the definitions needed stronger Darwin mojo than I possessed to work it out, as it tripped over the page size no longer being a constant, IIRC.

I've actually never seen a project that's considered PCH to be worth the effort, and most of the programming world seems to be moving on to modules as an alternative anyway. That said, this is clearly meant to be an optional part of GCC; it's just getting tangled up with using host-specific logic, but for a host that's clearly not ready to be a target in GCC yet. Since we're "just" doing cross-compiles here, that doesn't bother us much.

I've successfully run and executed all my own code in this combination now. It works as well as my Intel builds did.

My read of your patch is that it probably disables PCH on Intel-based hosts, too. It's worth noting there's actually nothing RISC-V specific to this; this would impact any cross-chain. The ARM-ies would benefit from this, too.

My guess is that somewhere on GCC trunk there is (or will be) a better fix for this in the works. It may be worth watching upstream for that.

Anyway, thanx for taking my hack to the appropriate level to let others benefit from it!

cfriedt commented 3 years ago

I'm hitting this as well. CPU is pegged at 100% at

checking for compiler with PCH support...

Reproducible on x86_64 MacBookPro and arm64 MacBookAir

cfriedt commented 3 years ago

Rebuilding with --with-multilib and the change mentioned above has fixed the linker error for me mentioned in riscv/riscv-gnu-toolchain#836. My diff is attached.

riscv-gnu-toolchain-fix-hang-on-pch-search.patch.txt

cfriedt commented 3 years ago

Also needed to use --with-cmodel=medany, and I made a PR for adding that by default for --enable-multilib in #58

venkatakrishs commented 3 years ago

Hi, I am trying to build riscv-tools on a Mac Mini M1 using Homebrew. I got the toolchain installed, but when I try to compile any code I get the following error:

/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x280): undefined reference to `__adddf3'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x28c): undefined reference to `__truncdfsf2'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x29e): undefined reference to `__fixsfsi'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x2b2): undefined reference to `__floatsisf'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x2c2): undefined reference to `__subsf3'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x2dc): undefined reference to `__ltsf2'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x39e): undefined reference to `__ltsf2'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x3c6): undefined reference to `__eqsf2'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x3e0): undefined reference to `__mulsf3'
/opt/homebrew/Cellar/riscv-gnu-toolchain/master/lib/gcc/riscv64-unknown-elf/11.1.0/../../../../riscv64-unknown-elf/bin/ld: util.c:(.text+0x3f2): undefined reference to `__fixsfsi'
collect2: error: ld returned 1 exit status
make[2]: *** [hello.riscv] Error 1
make[1]: *** [finish] Error 2
make: *** [software] Error 2

When I tried to reinstall with --with-multilib and --with-cmodel=medany I have the below error shown to me.

Error: invalid option: --with-multilib

Actually I need to install the toolchain with rv64imac and not rv64imafdc. Is there a way to change this installation? I also think that this issue is due to the mismatch in the toolchain arch. Any suggestion on this?

sbeamer commented 3 years ago

rv64imac is not supported by default. I would set up your own build configuration from https://github.com/riscv/riscv-gnu-toolchain. There is a large number of possible ISA extension combinations, so only a few combinations are on by default.

robertlipe commented 3 years ago

rv64imac (aka rv64gc) is a default. Run

    % riscv64-unknown-elf-gcc --print-multi-lib
    .;
    @.**@.=ilp32
    @.**@.=ilp32
    @.**@.=ilp32
    @.**@.=ilp32
    @.**@.=ilp32f
    @.**@.=lp64
    @.**@.=lp64d

I did have to turn on the 32-bit versions because I work mostly with tiny microcontrollers, but rv64gc is, like, the most universal/useful and is pretty much the LCD if you're building anything that's running a "real" OS like FreeBSD or Linux or such. Actually hardware floating point might be a requirement, but it's not that difficult to emulate if you don't have it; it's just not speedy.

integer multiply atomic compressed

That's not at all an uncommon configuration, Scott. Is there perhaps a typo in your message?

RJL


sbeamer commented 3 years ago

I stand corrected. I did not know rv64imac/lp64 was included in the current multilib recipes. The list of supported strings has changed over the years.

As a small note, rv64imac != rv64gc, as g implies f&d (as well as some nitpicky Z extensions).
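To make the difference concrete, here is a small sketch that expands the `g` shorthand in an ISA string. The helper name `expand_g` is made up for illustration, and it deliberately ignores the Z extensions that `g` also implies.

```shell
#!/bin/sh
# Hypothetical helper: expand the "g" shorthand in an ISA string to "imafd".
# The Z extensions that "g" also implies are deliberately left out.
expand_g() {
  printf '%s\n' "$1" | sed 's/^\(rv[0-9]*\)g/\1imafd/'
}

expand_g rv64gc     # prints rv64imafdc
expand_g rv64imac   # prints rv64imac (no "g", so unchanged)
```

Seen this way, rv64gc carries the f and d extensions that rv64imac lacks, which is why the two need different multilib entries.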

I've been waiting forever for a multilib build to finish, and will be pushing bottles soon.

venkatakrishs commented 3 years ago

Is there any specific place I have to mention to compile it for rv64imac/lp64. I have done it for Linux(Ubuntu) but Mac Environment is new to me. Any changes in .rb?

robertlipe commented 3 years ago

SB> I stand corrected. I did not know rv64imac/lp64 was included in the current multilib recipes. The list of supported strings has changed over the years.

Yes, these things have changed names over time.

SB> As a small note, rv64imac != rv64gc, as g implies f&d (as well as some nitpicky Z extensions).

I also stand corrected. I thought f was probably there. I forgot about F (because 'float' rarely makes sense over 'double' unless you're REALLY cache line limited). Still, that combination of offerings makes sense in a workstation-class configuration.

SB> I've been waiting forever for a multilib build to finish, and will be pushing bottles soon.

Those multilib builds are indeed expensive. (I remember the M1 made quick work of them, though...) I used to be a gcc maintainer 25+ (?) years ago and was one of the first users of multilib. I could only test nightly builds every other day because they took more than 24 hours to build and run tests in that era.

Thanx for the update!

VS> Is there any specific place I have to mention to compile it for rv64imac/lp64. I have done it for Linux(Ubuntu) but Mac Environment is new to me. Any changes in .rb?

As we discussed/discovered above, that's one of the default builds. Just set your compiler invocation (-march and -mabi) to call the precise combination you need so you're safe from those default changes that sneak in from time to time.
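As a sketch of that advice, the helper below checks a `--print-multi-lib` listing for an exact `-march`/`-mabi` pair before you rely on it. The name `has_multilib` is hypothetical, and the sample listing is an illustrative stand-in, not output captured from a real toolchain.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the --print-multi-lib listing on stdin
# has an entry for the given -march / -mabi pair.
has_multilib() {
  grep -Eq -- "@march=$1@mabi=$2(@|\$)"
}

# Illustrative stand-in for real compiler output (an assumption).
sample='.;
rv64imac/lp64;@march=rv64imac@mabi=lp64'

if printf '%s\n' "$sample" | has_multilib rv64imac lp64; then
  echo "rv64imac/lp64 is available"
fi

# With a toolchain on PATH, the real check and build would be roughly:
#   riscv64-unknown-elf-gcc --print-multi-lib | has_multilib rv64imac lp64
#   riscv64-unknown-elf-gcc -march=rv64imac -mabi=lp64 -c util.c
```

The `(@|$)` at the end of the pattern keeps `lp64` from also matching `lp64d`.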

RJL

venkatakrishs commented 3 years ago

Hi @sbeamer @robertlipe,

I have created and mounted a case-sensitive file system (APFS Case Sensitive) and moved the riscv-gnu-toolchain directory to it. I have issued the command "./configure --prefix=/riscv-gnu-toolchain-installation/ --with-arch=rv64imac --with-abi=lp64 --with-cmodel=medany --disable-float --disable-atomic". This command executed correctly, but when I start the compilation with the "make" command I get the following issue:

ld: warning: ignoring file ../libctf/.libs/libctf.a, building for macOS-arm64 but attempting to link with file built for macOS-arm64

Undefined symbols for architecture arm64:
  "_ctf_archive_iter", referenced from:
      _dump_bfd in objdump.o
  "_ctf_bfdopen_ctfsect", referenced from:
      _dump_bfd in objdump.o
  "_ctf_close", referenced from:
      _dump_bfd in objdump.o
  "_ctf_dict_close", referenced from:
      _dump_bfd in objdump.o
  "_ctf_dict_open", referenced from:
      _dump_bfd in objdump.o
  "_ctf_dump", referenced from:
      _dump_ctf_archive_member in objdump.o
  "_ctf_errmsg", referenced from:
      _dump_bfd in objdump.o
      _dump_ctf_errs in objdump.o
      _dump_ctf_archive_member in objdump.o
  "_ctf_errno", referenced from
      _dump_ctf_archive_member in objdump.o
  "_ctf_errwarning_next", referenced from:
      _dump_ctf_errs in objdump.o
  "_ctf_import", referenced from:
      _dump_ctf_archive_member in objdump.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[5]: *** [objdump] Error 1
make[4]: *** [all-recursive] Error 1
make[3]: *** [all] Error 2
make[2]: *** [all-binutils] Error 2
make[1]: *** [all] Error 2
make: *** [stamps/build-binutils-newlib] Error 2

I have also tried to add --enable-multilib

robertlipe commented 3 years ago

Isn't --target still needed?

Whatever you're showing is building native MacOS and not a cross targeting RISC-V.


robertlipe commented 2 years ago

The original issue with GCC on M1 has long since been addressed by sbeamer and myself. As this PR has become a dumping ground for unrelated issues, I'm closing it. Thanx to all that have participated.