Closed: tklauser closed this issue 4 years ago
Change https://golang.org/cl/133735 mentions this issue: unix: add support for linux/riscv64
/cc myself as Fedora/RISC-V maintainer.
I would be happy to include this in the distribution (once it compiles) and do some testing.
Thanks for raising this.
Does anyone know where I can get hardware to test on?
New house, CPU and FPGA efforts are going on using RISC-V. Quite a turning point and opportunity for golang programmers.
Here is the main news feed on all of it: https://riscv.org/news/
@gedw99 At the moment SiFive is the only company that has produced a RISC-V dev board (HiFive Unleashed). One caveat to this board is the price, which is around $1,000 USD.
cc @ganshun and @rjoleary the three of us were also about to start looking at a port. We got a couple of SiFive boards to try it out on.
@hugelgupf, when you're far enough along to want a builder set up, file a bug and copy me and I'd be more than happy to help.
I recently started using QEMU to run a RISC-V Fedora build. Here are the links in case they're useful for developing without a board.
https://github.com/riscv/riscv-qemu
https://fedoraproject.org/wiki/Architectures/RISC-V/Installing#Download_the_latest_disk_image
https://fedoraproject.org/wiki/Architectures/RISC-V/Installing#Boot_under_qemu
Using dnf install gccgo, I've been able to build and run Go apps in emulation.
Just a reminder that RISC-V is supported in released versions of QEMU and libvirt, which makes it easy to set up multiple VMs with management. See: https://fedoraproject.org/wiki/Architectures/RISC-V/Installing#Boot_with_libvirt
Note that our libffi does not support Go closures, so some functionality does not work. The patch exists already, but I need to backport it to our libffi version. Should I bump the priority on this?
FWIW, I started rebasing the Go 1.8-based https://github.com/riscv/riscv-go to current Go tip at https://github.com/tklauser/riscv-go. It currently builds on linux/amd64 but cannot generate valid riscv64 binaries yet.
I currently lack the time and resources to really push it forward, but maybe someone else working on it might find it useful.
Just saw that you can apparently run RISC-V in the cloud via an FPGA AWS instance:
That might be a possible temporary builder strategy.
@bradfitz https://fires.im/ is a RISC-V deployment on AWS FPGAs, but the underlying RISC-V implementation only runs up to something like 100 MHz. I'm working with a team that's doing some research in that area and that seems to be the norm. So far emulation has been the most performant by far.
QEMU supports up to 8 cores (MTTCG - multi-threaded TCG), which is great if you can compile in parallel.
Alternatively contact Palmer Dabbelt from SiFive and ask for a free SiFive HiFive Unleashed board (multiple projects have received one for porting efforts).
Change https://golang.org/cl/157899 mentions this issue: unix: use Renameat2 to implement Renameat on linux/riscv64
Change https://golang.org/cl/157900 mentions this issue: unix: use int8 for RawSockaddrUnix.Path on linux/riscv64
Change https://golang.org/cl/157901 mentions this issue: unix: add assembly for riscv64 syscalls
Issue for setting up a qemu-based RISC-V builder: #30262
Change https://golang.org/cl/170298 mentions this issue: cpu: add basic support for GOARCH=riscv64
The riscv qemu port works pretty well and Fedora has been making strides making it a normal distro. I am keeping tabs on the state of riscv here: https://github.com/marcopeereboom/riscv-bringup-doco and maybe it has some helpful nuggets for some folks. The point is that qemu makes a fine development target and you really don't need metal to make progress.
The last missing piece of the puzzle for me is go. https://github.com/riscv/riscv-go seems to work but is 1.8 and pretty much everything I work on has moved on to 1.12 and uses modules. I looked at porting the 1.8 codebase up a release at a time and it doesn't look terribly hard but I don't have time to sit there and grind at it (familiar refrain).
Excited to see this move forward.
A buddy of mine made Go 1.12 work. He forked https://github.com/riscv/riscv-go and updated it to go 1.12. It seems to produce binaries that work but undoubtedly there are still some bugs lurking.
It seems to be able to build binaries but when it is cross-compiled it does not quite work yet.
See the repo here: https://github.com/4a6f656c/riscv-go
It would be great if some of this can start being backported to master for more widespread testing.
A buddy of mine made Go 1.12 work.
Yay!
Did he also pull in the as-yet-unlanded changes from Stefan O'Rear?
It seems to be able to build binaries but when it is cross-compiled it does not quite work yet.
I haven't looked at the code, but it would be good to run the compiler under the race detector. (That is, run on amd64 with -race, building for risc-v.) The obj asm backends have to be concurrent as of 1.9.
It would be great if some of this can start being backported to master for more widespread testing.
I assume you mean upstreamed. We're in a code freeze now. See https://github.com/golang/go/wiki/Go-Release-Cycle. Ideally adding risc-v to master would happen right at the beginning of the cycle, starting around Aug 1.
Note that we can't accept code unless all the authors have signed a CLA (which is part of why I haven't looked through it yet).
Joel is ex google and has made several commits to Go. Pretty sure there won't be an issue with that.
Didn't know we were in code freeze but that is actually pretty good. That way we can get the last bits worked out and see if we can get it in as soon as it unfreezes and then we have an entire release cycle to get it right. Exciting!
Did he also pull in the as-yet-unlanded changes from Stefan O'Rear?
Got a link?
Joel is ex google and has made several commits to Go. Pretty sure there won't be an issue with that.
I should have checked out his GitHub profile. Nice to see his name again. :)
That way we can get the last bits worked out and see if we can get it in as soon as it unfreezes and then we have an entire release cycle to get it right.
Yes, indeed.
In an ideal world, too, we'd break up the upstreaming into a series of commits, to aid in reviewing. It's worth checking out how other completed architectures got upstreamed (e.g. arm64, wasm).
Got a link?
A few of us with an interest in making this happen are hanging out in #risc-v on Gophers Slack. We'd love to have more folks join us there, and to pitch in.
I'm happy to be involved, but I am intentionally not a Slack user. If y'all start using some other medium as well (github issues, a mailing list), please let me know.
I deleted slack on all my devices.
We certainly could use some help with the remaining bugs. It looks like some Linux syscalls fail or are not implemented. These bugs should not be super hard to hunt down and fix. We can even use GitHub to track bugs 😏
Little update. Bootstrapping now works reliably and one can build go itself in QEMU riscv. There seems to be either a bug with locks or with QEMU causing some fun crashes and we are looking into those. If anyone has the skills to help we'd love to hear from you.
Basic steps, on a machine with Go 1.4 or a new enough Go installed:
GOOS=linux GOARCH=riscv ./bootstrap.bash
Copy the tbz file to the riscv QEMU host, untar it and then build go itself:
GOGC=off GOROOT_BOOTSTRAP=~/build/go-linux-riscv-bootstrap/ ./all.bash
Note: all compilation was done from this repo: https://github.com/4a6f656c/riscv-go
It doesn't always work and may have to be restarted but it'll eventually build. I tried building some larger apps and they act the same. Work for a bit and then kind of randomly crash. It certainly looks like races.
Without using GOGC=off I get the following error:
# ./make.bash
Building Go cmd/dist using /root/go-linux-riscv-bootstrap/.
# _/root/riscv-go/src/cmd/dist
cmd/dist/test.go:1263:9: internal compiler error: '(*tester).hasSwig': panic during lowered cse while compiling (*tester).hasSwig:
runtime error: index out of range
goroutine 9 [running]:
cmd/compile/internal/ssa.Compile.func1(0x210198d118, 0x210134f1e0)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/ssa/compile.go:45 +0xc0
panic(0xc20c00, 0x14821a0)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/runtime/panic.go:522 +0x254
cmd/compile/internal/ssa.cse(0x210134f1e0)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/ssa/cse.go:116 +0x26d4
cmd/compile/internal/ssa.Compile(0x210134f1e0)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/ssa/compile.go:90 +0x67c
cmd/compile/internal/gc.buildssa(0x21005509a0, 0x3, 0x0)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/gc/ssa.go:233 +0x9d4
cmd/compile/internal/gc.compileSSA(0x21005509a0, 0x3)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/gc/pgen.go:299 +0x40
cmd/compile/internal/gc.compileFunctions.func2(0x2100ea3080, 0x21010b2970, 0x3)
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/gc/pgen.go:364 +0x50
created by cmd/compile/internal/gc.compileFunctions
/Users/cdepaula/repos/go-linux-riscv-bootstrap/src/cmd/compile/internal/gc/pgen.go:362 +0x160
I've built the bootstrap on Mac with Go 1.12.5.
After building it successfully (never failed) I ran the tests and got some problems. Will paste here as a follow-up to document:
Running on Qemu 4.0 on Mac.
# uname -a
Linux fedora-riscv 5.1.0-rc7-00005-g83a50840e72a #2 SMP Mon Apr 29 19:07:37 -03 2019 riscv64 riscv64 riscv64 GNU/Linux
Here is the output:
Now will try to look into building some test applications and into these errors.
I edited your comment to put the long log into a "Details" block.
fatal error: cas1
runtime: panic before malloc heap initialized
runtime stack:
runtime.throw(0x187542, 0x4)
/root/riscv-go/src/runtime/panic.go:617 +0x88 fp=0x3fffdd1030 sp=0x3fffdd1008 pc=0x4aaf8
runtime.check()
/root/riscv-go/src/runtime/runtime1.go:215 +0x57c fp=0x3fffdd1070 sp=0x3fffdd1030 pc=0x603ac
runtime.rt0_go(0x3fffdd1098, 0xd3868, 0x4, 0x3fffdd1392, 0x3fffdd13ba, 0x3fffdd13f4, 0x3fffdd1405, 0x0, 0x3fffdd1418, 0x3fffdd1428, ...)
/root/riscv-go/src/runtime/asm_riscv.s:54 +0x90 fp=0x3fffdd1078 sp=0x3fffdd1070 pc=0x82ad0
This failure, which appears elsewhere as well, shows quite clearly that there is something wrong with the implementation of atomic.Cas. As that could easily cause race conditions elsewhere, you need to fix that first.
Following up, for most, maybe all, targets, atomic.Cas is implemented in the compiler itself.
Perhaps https://review.gerrithub.io/c/riscv/riscv-go/+/353663 may help. (And if it does, it is definitely worth looking at the other as-yet-unmerged CLs in the list I linked to above.)
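For anyone who wants to poke at this from user space, here is a minimal sketch of the kind of compare-and-swap sanity check that the "cas1" failure points to. It is an approximation, not the runtime's code: it uses sync/atomic.CompareAndSwapUint32 instead of the runtime-internal atomic.Cas, and the function name casSelfCheck and the step labels are made up for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casSelfCheck is a hypothetical helper. It verifies the two basic CAS
// guarantees: a swap whose expected value matches must succeed and store
// the new value, and a swap whose expected value does not match must fail
// and leave memory untouched. It returns the label of the first failing
// step, or "" if every step behaves correctly.
func casSelfCheck() string {
	var z uint32 = 1
	if !atomic.CompareAndSwapUint32(&z, 1, 2) {
		return "cas1" // expected value matched, so the swap must succeed
	}
	if z != 2 {
		return "cas2" // a successful swap must store the new value
	}
	z = 4
	if atomic.CompareAndSwapUint32(&z, 5, 6) {
		return "cas3" // expected value did not match, so the swap must fail
	}
	if z != 4 {
		return "cas4" // a failed swap must not modify memory
	}
	return ""
}

func main() {
	if failed := casSelfCheck(); failed != "" {
		fmt.Println("CAS self-check failed at step", failed)
		return
	}
	fmt.Println("CAS self-check passed")
}
```

A failure at the first step would mean that even an uncontended CAS is broken, which would fit a crash this early in startup.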
First of all, awesome! Second, does the port support the 32-bit RISC-V variant?
I have 32-bit Linux-capable hardware synthesized on an FPGA which I want to try golang on.
No, this does not support riscv32.
Joel and I have narrowed this down to locking in QEMU. We have written several smaller test programs to reproduce the bug easily. I am going to write the locking/atomic issues up in the QEMU repo.
If someone has real hardware to test on that would help to see if this is QEMU or not.
When we fix these issues go might be in pretty good shape actually.
This failure, which appears elsewhere as well, shows quite clearly that there is something wrong with the implementation of atomic.Cas. As that could easily cause race conditions elsewhere, you need to fix that first.
This is exactly right.
@marcopeereboom let me know if there are workarounds while building on QEMU and where to follow these reports. I'm getting a SiFive Unleashed board soon, so we will be able to build on real iron to remove these variables.
I have a riscv64 board but not much time, if you give me a shell script I'll run it for you (on Alpine Linux).
Nice work Joel (@4a6f656c I assume?) and @marcopeereboom!
One small note: the reserved GOARCH value for 64-bit RISC-V is riscv64. riscv is reserved for 32-bit RISC-V, see https://golang.org/cl/106256
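As a quick illustration of that naming split, here is a small sketch that maps a GOARCH string to the word size it implies. The helper name riscvWordSize is hypothetical. Only the two reserved values mentioned above are assumed, and runtime.GOARCH is used just to report what the running binary was built for.

```go
package main

import (
	"fmt"
	"runtime"
)

// riscvWordSize is a hypothetical helper that reports the register width
// implied by a GOARCH value, following the reservation described above:
// "riscv64" is the 64-bit port and "riscv" is reserved for 32-bit.
// ok is false for any value that is not one of those two.
func riscvWordSize(goarch string) (bits int, ok bool) {
	switch goarch {
	case "riscv64":
		return 64, true
	case "riscv":
		return 32, true
	}
	return 0, false
}

func main() {
	if bits, ok := riscvWordSize(runtime.GOARCH); ok {
		fmt.Printf("GOARCH=%s: %d-bit RISC-V\n", runtime.GOARCH, bits)
	} else {
		fmt.Printf("GOARCH=%s: not a RISC-V target\n", runtime.GOARCH)
	}
}
```

So a tree that builds 64-bit binaries under GOARCH=riscv would need the rename before it could line up with the reserved values.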
I took the liberty to fork Joel's tree and did the rename (plus some other cleanup fixes). You can find the tree at https://github.com/tklauser/riscv-go/commits/riscv64dev. Unfortunately I don't seem to be able to send a PR against Joel's fork, but please feel free to pull any patches you deem useful.
I'll also try to update my riscv branch on https://github.com/tklauser/go/commits/riscv (which tracks gotip) with the changes from your branch and possibly https://review.gerrithub.io/c/riscv/riscv-go/+/353663 plus further patches from this series.
Hey @tklauser Joel had pushed the rename already. We certainly will take your help and if you have additional PRs on top of the current tree let us know. And thanks for doing a current branch, that will prove useful.
I have a riscv64 board but not much time, if you give me a shell script I'll run it for you (on Alpine Linux).
I pretty much pasted the instructions to build it. Building go, unlike userspace and kernels, is trivial.
I merged @tklauser patches into the fork I did from @4a6f656c . Had to adjust some conflicts and also hammer in a change in a function where the dynamic generation used an unavailable syscall. The files are in my fork as a temporary test (https://github.com/carlosedp/riscv-go). I'm currently running the tests and apparently see fewer errors (@josharian those CAS errors are gone).
Will post the test run soon.
And after some more playing with the test code we are convinced that it is either QEMU or the kernel. I am leaning towards the atomic code being preempted and the CPU losing state somehow. I wrote a C threaded app that uses a CAS that is pretty similar to Go's and made it into a poor man's mutex. The critical section just increments a global and at the end of the run the expected value is compared against the actual value.
The results are that when run on linux inside QEMU with enough threads and/or iterations it misses a couple of locks. When run with qemu-riscv without a kernel it runs indefinitely without failure. Code for this test: https://gist.github.com/marcopeereboom/1a3b62a89f81b2ed341082cedaef4874
More worrisome is that some gcc atomics also fail in a similar test. Code: https://gist.github.com/marcopeereboom/1e40d4baffdcc9a2066310d770f5ac12
We did try adding some extra fences but that had no effect. Code: https://gist.github.com/marcopeereboom/4357c59b57dc998a58d37817d59f99b0
The disassembly looks correct and when held against the spec it seems like it should work.
I know these snippets won't win the state pageant but they are small enough for quick tests. I am debating where to record these issues. Does anyone have an opinion? Linux kernel, QEMU?
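For readers who would rather try the same idea from Go than from C, here is a rough sketch of the test described above: a crude CAS-based mutex guarding a plain counter, with the final count compared against the expected one. This is not the code in the gists. The names casLock and runCounter and the worker and iteration counts are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// casLock is a deliberately crude spinlock built on compare-and-swap:
// state 0 means unlocked, state 1 means locked.
type casLock struct{ state uint32 }

func (l *casLock) lock() {
	for !atomic.CompareAndSwapUint32(&l.state, 0, 1) {
		runtime.Gosched() // yield so the current holder can make progress
	}
}

func (l *casLock) unlock() { atomic.StoreUint32(&l.state, 0) }

// runCounter starts the given number of goroutines; each increments a
// shared counter the given number of times while holding the lock.
// It returns the final counter value.
func runCounter(workers, iterations int) int {
	var (
		l       casLock
		counter int
		wg      sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				l.lock()
				counter++ // critical section: plain, non-atomic increment
				l.unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	const workers, iterations = 20, 100000
	want := workers * iterations
	got := runCounter(workers, iterations)
	if got == want {
		fmt.Println("ok", got, want)
	} else {
		fmt.Println("mismatch", got, want)
	}
}
```

If CAS works, the two numbers must match on every run. A shortfall means increments were lost inside the critical section, which is the symptom described above.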
Attempted to build your go stuff. Bootstrap went well, but it failed early when doing the full build:
https://paste.sr.ht/~sircmpwn/3a76f2ee0b914eccf2cc70fc6ca6b7a75de547a3
Your smaller test code:
hifive:/tmp$ ./a.out
in thread 10
in thread 11
in thread 12
in thread 27
in thread 28
in thread 14
in thread 23
in thread 24
in thread 26
in thread 25
in thread 19
in thread 13
in thread 15
in thread 29
in thread 20
in thread 17
in thread 18
in thread 22
in thread 21
in thread 16
ok 2000000 2000000
Note, I'm using musl libc for both bootstrapping and on riscv64.
It sometimes works. May have to play with the iterations and threads a bit on your hardware.
Spent a few minutes tweaking it and haven't gotten it to fail yet. If the issue is between kernel and qemu, kernel's looking pretty safe.
This is what I see. This explains the go issues too.
[root@stage4 ~]# ./a.out
in thread 10
in thread 11
in thread 12
in thread 13
in thread 18
in thread 15
in thread 16
in thread 17
in thread 19
in thread 14
in thread 20
in thread 21
in thread 23
in thread 22
in thread 26
in thread 27
in thread 29
in thread 24
in thread 28
in thread 25
womp womp 19999976 20000000
Can you share what qemu and linux kernel you are running?
This is what I see compiling the test code:
[root@fedora-riscv ~]# gcc --version
gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[root@fedora-riscv ~]#
[root@fedora-riscv ~]# gcc test1.c
test1.c: In function ‘main’:
test1.c:85:41: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
85 | pthread_create(&tid[i], NULL, thread, (void *)i+10);
| ^
/usr/bin/ld: /tmp/cc1fyhzx.o: in function `main':
test1.c:(.text+0x14e): undefined reference to `pthread_create'
/usr/bin/ld: test1.c:(.text+0x188): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
I'm running Fedora nightly with Kernel 5.1.0-rc7-00005-g83a50840e72a.
Here are the results of the test run for Go after the merges:
See the link here:
https://paste.sr.ht/~sircmpwn/3a76f2ee0b914eccf2cc70fc6ca6b7a75de547a3
I'm on a real riscv64 board, HiFive Unleashed. Kernel is 4.20-rc4 plus this patch:
https://github.com/esmil/linux/commit/870b04e85662a02ee1f6333e1d037c774ed4350e
And this config:
https://paste.sr.ht/~sircmpwn/88c5f57715b904896c5cbe57c5d6674f738d0e51
@carlosedp Run gcc -pthread test1.c.
Do something like: gcc x.c -pthread. The warning is because of a terrible hack.
This issue serves to track the port to the RISC-V architecture. There is an out-of-tree port at https://github.com/riscv/riscv-go based on Go 1.8 which - according to riscv/riscv-go#19 - is no longer maintained and would need quite some work to be updated to the current Go tip.
Also see https://golang.org/cl/106256#message-2d9a5c5b89ad55b8b7999f794983f993649232c8 and https://groups.google.com/forum/#!searchin/golang-dev/RISC%7Csort:date/golang-dev/VpsyGdi-sQQ/FMu6IB_2CwAJ where @josharian summarized the current state of the existing port.
The GOARCH values riscv and riscv64 were reserved in https://golang.org/cl/106256. These values are already used by gccgo. Additional changes were made to debug/elf (https://golang.org/cl/107339), cmd/cgo (https://golang.org/cl/110066), cmd/dist and go/types (https://golang.org/cl/118618) in order to be able to generate type definition files in the x/sys/unix package (https://golang.org/cl/133735).
/cc @bradfitz @ianlancetaylor @josharian