Open lazyprogrammerio opened 4 months ago
Current state of building and running ETH execution clients on RISC-V:
Current state of building and running ETH consensus clients on RISC-V:
For Prysm -> Bazel issue created: https://github.com/bazelbuild/bazel/issues/23018
For Lodestar -> https://github.com/riscv-forks/electron-riscv-releases/issues/1
Lodestar: After deciding not to install the binaries for the Electron dependency to see how far I could get, I reached the big blocker: classic-level (a LevelDB wrapper) has no support for RISC-V in its assembly code.
PR was created to see if it fixes the issue: https://github.com/Level/classic-level/pull/94.
One note here: Lodestar depends on classic-level, which depends on LevelDB. The LevelDB version used in classic-level seems to be 7 years old (if I am not wrong): https://github.com/google/leveldb/commits/v1.20, see https://github.com/Level/classic-level/commits/main/deps/leveldb.
Nimbus - DOES NOT WORK: gcc: error: ‘-march=native’: ISA string must begin with rv32 or rv64
Probably easy to fix and get a step further by adding rv64 target
Grandine - DOES NOT WORK: error: failed to run custom build command for ring v0.16.20
Seems to fail due to missing arch info in that crate.
Erigon - DOES NOT WORK: github.com/prysmaticlabs/gohashtree@v0.0.3-alpha.0.20230502123415-aafd8b3ca202/hash.go:77:5: undefined: supportedCPU. From the look of it, the supported amd64 and arm64 arches have their code written in assembly, which is going to be quite a challenge for RISC-V: https://github.com/prysmaticlabs/gohashtree/blob/main/hash_arm64.s
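For the Nimbus -march=native failure above, one way to add an rv64 target is to pick the flag from the machine name. This is a minimal sketch and not necessarily what the fix mentioned in this thread does: march_flag is a hypothetical helper, and rv64gc is an assumed baseline ISA string that may not suit every board.

```shell
#!/bin/sh
# march_flag MACHINE: hypothetical helper that maps a machine name
# (as printed by `uname -m`) to a GCC -march flag.
march_flag() {
  case "$1" in
    riscv64) echo "-march=rv64gc" ;;  # assumption: RV64GC baseline
    riscv32) echo "-march=rv32gc" ;;  # assumption: RV32GC baseline
    *)       echo "-march=native" ;;  # keep the existing behaviour elsewhere
  esac
}

# Print the flag for the machine this script runs on
march_flag "$(uname -m)"
```

The result could then be passed to the compiler through whatever variable the build system uses for extra C flags; which variable that is depends on the project.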
With @haurog's fix, here is a Dockerfile to build Nimbus beacon/validator:
FROM alpine:edge
RUN apk update
RUN apk add nim
RUN nim --version
RUN apk update && apk add --no-cache make gcc musl-dev linux-headers git bash git-lfs
WORKDIR /usr/src
RUN bash -c "git clone --recurse-submodules -j8 https://github.com/lazyprogrammerio/nimbus-eth2 nimbus-eth2"
RUN bash -c "cd nimbus-eth2 && make USE_SYSTEM_NIM=1 -j$(nproc) update"
RUN bash -c "cd nimbus-eth2 && make USE_SYSTEM_NIM=1 -j$(nproc) nimbus_beacon_node nimbus_validator_client"
Output:
Build completed successfully: build/nimbus_validator_client
Build completed successfully: build/nimbus_beacon_node
Dockerfile to build Nimbus for eth-docker:
# Build Nimbus in a stock alpine container
FROM nimbus/devel:stage1 AS builder
# nimbus/devel:stage1 is the above comment Dockerfile built image
# Included here to avoid build-time complaints
ARG DOCKER_TAG
ARG DOCKER_VC_TAG
ARG DOCKER_REPO
ARG DOCKER_VC_REPO
ARG BUILD_TARGET
ARG SRC_REPO
# Pull all binaries into a second stage deploy alpine container
FROM alpine:edge AS consensus
ARG USER=user
ARG UID=10002
RUN apk update && apk add ca-certificates bash tzdata git curl
RUN apk update && apk add --no-cache make gcc musl-dev linux-headers git bash git-lfs
# See https://stackoverflow.com/a/55757473/12429735
RUN adduser \
--disabled-password \
--gecos "" \
--home "/nonexistent" \
--shell "/usr/sbin/nologin" \
--no-create-home \
--uid "${UID}" \
"${USER}"
RUN mkdir -p /var/lib/nimbus/ee-secret && chown -R ${USER}:${USER} /var/lib/nimbus && chmod 700 /var/lib/nimbus && chmod 777 /var/lib/nimbus/ee-secret
# Cannot assume buildkit, hence no chmod
COPY --from=builder --chown=${USER}:${USER} /usr/src/nimbus-eth2/build/nimbus_beacon_node /usr/local/bin/
COPY --chown=${USER}:${USER} ./docker-entrypoint.sh /usr/local/bin/
COPY --chown=${USER}:${USER} ./validator-exit.sh /usr/local/bin/
# Belt and suspenders
RUN chmod -R 755 /usr/local/bin/*
USER ${USER}
ENTRYPOINT ["nimbus_beacon_node"]
FROM alpine:edge AS validator
ARG USER=user
ARG UID=10000
RUN apk update && apk add --no-cache make gcc musl-dev linux-headers git bash git-lfs
# See https://stackoverflow.com/a/55757473/12429735
RUN adduser \
--disabled-password \
--gecos "" \
--home "/nonexistent" \
--shell "/sbin/nologin" \
--no-create-home \
--uid "${UID}" \
"${USER}"
# Create data mount point with permissions
RUN mkdir -p /var/lib/nimbus && chown -R ${USER}:${USER} /var/lib/nimbus && chmod -R 700 /var/lib/nimbus
# Cannot assume buildkit, hence no chmod
COPY --from=builder --chown=${USER}:${USER} /usr/src/nimbus-eth2/build/nimbus_validator_client /usr/local/bin/
COPY --chown=${USER}:${USER} ./docker-entrypoint-vc.sh /usr/local/bin/
# Belt and suspenders
RUN chmod -R 755 /usr/local/bin/*
USER ${USER}
ENTRYPOINT ["nimbus_validator_client"]
I've adjusted Dockerfile.source for Nimbus to build on alpine:3. Test that on your RISC-V machine, if you would.
Can you also get me the output of uname -a on that machine, please? That should allow me to adjust ./ethd config so it offers a Nimbus/Geth combo on RISC-V.
Went for it by looking for riscv. Try an ./ethd config and see how it behaves for you, please.
You'd need to adjust NIM_SRC_REPO and/or NIM_SRC_BUILD_TARGET manually in .env until that build fix makes it into a release.
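The manual .env adjustment could be scripted along these lines. This is only a sketch: set_env_var is a hypothetical helper and not part of eth-docker, the repository URL is the fork used in the Dockerfile earlier in this thread, and whether NIM_SRC_REPO expects exactly that form is an assumption. No value is shown for NIM_SRC_BUILD_TARGET because the thread does not give one.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE: hypothetical helper, not part of eth-docker.
# Replaces an existing KEY=... line, or appends one if the key is missing.
set_env_var() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp"
  cat "$tmp" > "$file"   # write back in place so the file keeps its permissions
  rm -f "$tmp"
}

# Demo on a scratch file rather than a real .env
demo_env=$(mktemp)
printf 'NIM_SRC_REPO=placeholder\n' > "$demo_env"
set_env_var "$demo_env" NIM_SRC_REPO https://github.com/lazyprogrammerio/nimbus-eth2
cat "$demo_env"
```

Pointing the helper at the real .env instead of the scratch file would apply the same change there; taking a backup first is prudent.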
Here is the output from uname -a
Linux 5.10.113+ #1 SMP PREEMPT Thu Apr 25 13:17:48 UTC 2024 riscv64 riscv64 riscv64 GNU/Linux
Thanks. I could adjust the grep to riscv64 as other riscv architectures won’t work with existing docker images. But, no rush. Let’s see whether it works at all, first
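Tightening the match from riscv to riscv64 could look roughly like this. It is a sketch only: is_riscv64 is a hypothetical helper and not the actual ethd code.

```shell
#!/bin/sh
# is_riscv64 STRING: succeeds if STRING (for example the output of
# `uname -a`) contains riscv64 as a whole word. Hypothetical helper.
is_riscv64() {
  printf '%s\n' "$1" | grep -qw 'riscv64'
}

if is_riscv64 "$(uname -a)"; then
  echo "riscv64 detected"
else
  echo "not riscv64"
fi
```

Matching against uname -m instead of uname -a would avoid a false positive from a hostname that happens to contain riscv64.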
Checking available products. Milk-V Jupiter seems likely with NVMe and 16 GiB RAM. Ditto SiFive HiFive Unmatched Rev. B.
Still hard mode compared to Odroid H4 (Ultra), but doable.
LicheePi has no NVMe from what I can see, not a good choice.
That pretty much is our conclusion as well. The boards we have access to at the moment are mainly for seeing whether we can get anything running at all. Maybe there is a chance to split consensus and execution across two boards, but it is still going to be difficult. Unfortunately, the Jupiter is not widely available at the moment; only some preorder units have been sent out. The development of new CPUs and boards seems to be very fast, and in a year or so we might be in a very different position. Ideally we would have clients ready to be used by then.
Did a pull request for nimbus: https://github.com/status-im/nimbus-eth2/pull/6439
If accepted and merged to the stable branch we will be able to directly build from their repo. The build script automatically checks if it is being built on a risc-v board and we do not have to change any build parameters.
Nice! Does ethd config detect riscv and offer only nimbus and Geth, in your testing?
I think @lazyprogrammerio does all the configurations in the .env file directly. As far as I know, nothing has yet been changed in ethd config.
ethd config already detects riscv and acts accordingly. It offers nimbus and Geth and sets Dockerfile.source. It doesn’t change the source repo as that’s maybe not necessary once your pr has been accepted.
that code hasn’t been tested as I don’t have a riscv
Ah, I see. I totally forgot that you implemented that already. Thanks for reminding me. @lazyprogrammerio have you tested it? Otherwise I will test it tomorrow.
@yorickdowne, I just tested it. The config works. Dockerfile.source is set, but you accidentally set it for Nethermind (NM) instead of Nimbus (NIM). After the config finishes it fails with the following error:
Total reclaimed space: 11.34kB
[+] Pulling 14/15
✔ execution Skipped 0.0s
✔ validator Skipped 0.0s
✔ consensus Skipped 0.0s
✔ grafana Skipped 0.0s
✔ mev-boost Skipped 0.0s
✔ validator-keys Skipped 0.0s
✔ prometheus Skipped 0.0s
✘ blackbox-exporter Error 1.0s
✘ node-exporter Error 1.0s
✘ promtail Error 1.0s
✘ validator-exit Error 1.0s
⠋ cadvisor Pulling 1.0s
✘ loki Error 1.0s
✘ json-exporter Error 1.0s
✘ ethereum-metrics-exporter Error 1.0s
no matching manifest for linux/riscv64 in the manifest list entries
./ethd terminated with exit code 18 on line 20
This happened during ./ethd config
I guess most of the needed images have no riscv64 entries in their Docker manifests.
Got it thanks, I’ll fix that!
Yes indeed. Arm64 is rare, riscv64 is not a thing. The clients will need to be source compiled locally until / unless some teams start publishing riscv64 images
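Whether an image publishes a riscv64 variant can be checked before pulling by looking at its manifest list. This is a rough sketch: has_riscv64 is a hypothetical helper that does a plain text match on the JSON printed by docker manifest inspect, not a real JSON parse, and the inline sample below is made up for the demo.

```shell
#!/bin/sh
# has_riscv64: reads manifest JSON on stdin and succeeds if any entry
# declares the riscv64 architecture. Hypothetical helper using a simple
# text match rather than a JSON parser.
has_riscv64() {
  grep -Eq '"architecture"[[:space:]]*:[[:space:]]*"riscv64"'
}

# Demo with a made-up manifest fragment (no Docker or network needed)
printf '%s\n' '{"platform": {"architecture": "riscv64", "os": "linux"}}' \
  | has_riscv64 && echo "riscv64 image available"

# Real usage needs Docker and registry access, so it is left commented out:
#   docker manifest inspect ethereum/client-go | has_riscv64 \
#     && echo "riscv64 image available"
```

Images that fail this check would have to be built from source locally or left out of the compose setup on riscv64.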
I tested some more execution client builds on riscv. I followed the docs from each project to build the client locally.
Besu: builds, but at runtime a library is missing: 'ckzg4844jni', which might be https://github.com/ethereum/c-kzg-4844. Needs further investigation.
Nethermind: no dotnet available on device. Dotnet has been built for riscv, might need to install manually.
Reth: builds, but fails starting:
2024-07-23T10:31:20.493075Z INFO Opening database path="/home/haurog3389/.local/share/reth/mainnet/db"
2024-07-23T10:31:20.599685Z ERROR shutting down due to error
Error: failed to open the database: unknown error code (12)
Might be due to disk space limitations. Needs to be tested again with an actual ssd.
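To rule disk space in or out before retrying, the free space on the filesystem holding the data directory can be checked first. A small sketch: avail_kib is a hypothetical helper, and how much space Reth actually needs is not stated in this thread, so no threshold is applied.

```shell
#!/bin/sh
# avail_kib PATH: prints the available space, in KiB, on the filesystem
# that contains PATH. Hypothetical helper built on POSIX df output.
avail_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Check the path given as the first argument, or / if none is given
target=${1:-/}
echo "available on $target: $(avail_kib "$target") KiB"
```

Running it against the parent directory of the Reth database path from the log would show how much room that filesystem has left.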
To conclude all the build tests: geth and nimbus are working. These two are perfect for the initial tests, as they are still the most resource-efficient clients. Besu and Reth might become usable with some modifications. Nethermind needs further investigation into how to run dotnet on riscv. Erigon is most probably a no-go due to its assembly language dependencies. Lighthouse, teku and lodestar need additional investigation and maybe some fixes to get them running. Prysm might be the most difficult one, as the build tools do not support riscv.
While trying to get to the bottom of the lighthouse build failure, I found someone from Flashbots who is trying to build clients on RISC-V: https://github.com/RustCrypto/utils/issues/1087
I will try to contact them.
Did a pull request for the failing library in the lighthouse build. Let's hope this will fix the build: https://github.com/sigp/ethereum_hashing/pull/8
Lighthouse builds locally with a lot of patching and upgrading dependencies. Will have to see what the best course of action is to get these changes into lighthouse and its dependencies.
Besu: builds, but at runtime a library is missing: 'ckzg4844jni', which might be https://github.com/ethereum/c-kzg-4844. Needs further investigation.
Got a board coming from aliexpress.
I think getting besu working will just be a matter of locally building https://github.com/Consensys/jc-kzg-4844/ on a risc-v system, and using that in the besu build. That lib currently only publishes packages for x86_64 and arm64, but I suspect it should build fine on an armbian risc-v system.
That would use java-native for things like secp256k1, but that should get the ball rolling (albeit slowly until we get besu-native support for risc-v).
Linking this document here, if anyone needs it in the future. It compiles all the knowledge gathered in the last few weeks related to execution/consensus usage / support / hacks, board kernels and gotchas/quirks, OS support and more: https://github.com/lazyprogrammerio/eth-docker-docs/blob/main/website/docs/Usage/OtherArches.md
I wrote an issue and a first pull request to lighthouse to make them compatible with RISC-V: https://github.com/sigp/lighthouse/issues/6297
It will be a few more steps to get cpufeatures and libp2p ready for RISC-V. This is just the start.
That would use java-native for things like secp256k1, but that should get the ball rolling (albeit slowly until we get besu-native support for risc-v).
Hi Gary, this would apply to Teku as well, right? (Getting "Teku failed to start: BLS native library unavailable for this platform" there)
Not sure if this is the same issue @lazyprogrammerio mentioned when they tried it: https://github.com/eth-educators/eth-docker/issues/1873#issuecomment-2230076446
Linking this document here, if anyone needs it in the future. It compiles all the knowledge gathered in the last few weeks related to execution/consensus usage / support / hacks, board kernels and gotchas/quirks, OS support and more: https://github.com/lazyprogrammerio/eth-docker-docs/blob/main/website/docs/Usage/OtherArches.md
Do you want to offer that as a PR to the docs?
Current state of building and running ETH consensus clients on RISC-V:
- teku - DOES NOT WORK. Building works. Running does not work because of upstream issues with https://github.com/supranational/blst [FIXED by building libblst.so manually] and RocksDB https://github.com/facebook/rocksdb [to be investigated]
RocksDB compiles on the Banana Pi F3, but Teku needs some additional build configuration to be rebuilt with native RocksDB support:
Teku failed to start: java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni8413112201487393251.so: /tmp/librocksdbjni8413112201487393251.so: cannot open shared object file: No such file or directory (Possible cause: can't load AMD 64 .so on a RISCV64 platform)
Asked in Teku Discord but no response so far.
- prysm - DOES NOT WORK. Building cannot be started, as bazel/bazelisk https://github.com/bazelbuild/bazelisk does not support RISC-V. To be investigated: https://bazel.build/install/compile-source#bootstrap-bazel
Compiling directly with go build results in BLST issues:
env GO111MODULE=on go build -o ./build/bin/beacon-chain ./cmd/beacon-chain
# github.com/prysmaticlabs/gohashtree
../go/pkg/mod/github.com/prysmaticlabs/gohashtree@v0.0.4-beta.0.20240624100937-73632381301b/hash.go:46:5: undefined: supportedCPU
../go/pkg/mod/github.com/prysmaticlabs/gohashtree@v0.0.4-beta.0.20240624100937-73632381301b/hash.go:56:5: undefined: supportedCPU
../go/pkg/mod/github.com/prysmaticlabs/gohashtree@v0.0.4-beta.0.20240624100937-73632381301b/hash.go:86:5: undefined: supportedCPU
# github.com/prysmaticlabs/prysm/v5/crypto/bls
crypto/bls/bls.go:19:14: undefined: blst.SecretKeyFromBytes
crypto/bls/bls.go:24:14: undefined: blst.PublicKeyFromBytes
crypto/bls/bls.go:31:14: undefined: blst.SignatureFromBytesNoValidation
crypto/bls/bls.go:36:14: undefined: blst.SignatureFromBytes
crypto/bls/bls.go:41:14: undefined: blst.MultipleSignaturesFromBytes
crypto/bls/bls.go:46:14: undefined: blst.AggregatePublicKeys
crypto/bls/bls.go:51:14: undefined: blst.AggregateMultiplePubkeys
crypto/bls/bls.go:56:14: undefined: blst.AggregateSignatures
crypto/bls/bls.go:61:14: undefined: blst.AggregateCompressedSignatures
crypto/bls/bls.go:66:14: undefined: blst.VerifySignature
crypto/bls/bls.go:66:14: too many errors
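The Teku UnsatisfiedLinkError above hints that the bundled librocksdbjni was built for AMD64. One way to confirm which architecture a shared library targets is to read the e_machine field of its ELF header. A minimal sketch: elf_machine is a hypothetical helper, it assumes a little-endian ELF file, and it knows only three machine codes (62 for x86-64, 183 for AArch64, 243 for RISC-V).

```shell
#!/bin/sh
# elf_machine FILE: prints the target architecture of an ELF file by
# reading the 2-byte e_machine field at offset 18 of the header.
# Hypothetical helper; assumes a little-endian ELF file.
elf_machine() {
  od -An -tu1 -j18 -N2 "$1" | {
    read -r lo hi
    code=$(( ${lo:-0} + 256 * ${hi:-0} ))
    case "$code" in
      62)  echo "x86_64" ;;
      183) echo "aarch64" ;;
      243) echo "riscv" ;;
      *)   echo "unknown($code)" ;;
    esac
  }
}

# Inspect the file given as the first argument, if any
if [ $# -gt 0 ]; then
  elf_machine "$1"
fi
```

Where the file utility is installed, running file on the library reports the same information in more detail.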
The new nimbus release (24.8.0) can now be built on RISC-V out of the box. They included my PR. We can now build nimbus directly from stable releases.
I did a PR in the libp2p library to make it compatible with risc-v: https://github.com/libp2p/rust-libp2p/issues/5590
Thanks everybody, this is lovely work. People start to notice and contribute, this is a very powerful idea.
I've been learning and playing around:
https://github.com/a16z/helios/issues/370 https://github.com/ethereum/trin/issues/1444 https://github.com/flashbots/mev-boost/issues/681
Also, I want to use this jenkins project for nightly tests to make sure we don't regress: https://dash.cloud-v.co/view/all/job/flashbots/
Update on hardware:
Milk-V is likely the best choice if you can get it, as the CPU is metal-enclosed and easy to cool
These do not sync mainnet yet; the CPU is too slow. Hardware coming out in 2025 should bring it to parity or near-parity with ARM64 boards like the RK3588 ones.
@come-maiz Thanks for the work, I see that some of the PRs are already merged.
There was a DevCon presentation on the topic of RISC-V too: https://app.devcon.org/schedule/J3SWYT The feedback at DevCon was very positive, and there is ongoing work for more consensus/execution clients to be supported.
Currently, only geth offers RISC-V images: https://registry.hub.docker.com/r/ethereum/client-go Lighthouse works with a few small library upgrades applied out of tree; we are in the process of getting them merged and hopefully having official images.
Thanks a lot!
Please share the video of the presentation once it's out :)
Hello, the presentation video is here at https://app.devcon.org/schedule/J3SWYT. Thanks!
As upstream software support for RISC-V has come a long way, there is a real possibility of a dockerized approach to Ethereum staking on RISC-V boards such as SiFive, Milk-V, and LicheePi.