Closed yvan-sraka closed 10 months ago
I wanted to measure the latency improvement of this hack, so I wrote a quick-and-dirty Python script:
import os
import timeit

DEV_SHELLS = [
    "ghc8107",
    "ghc902",
    "ghc925",
    "ghc8107-minimal",
    "ghc902-minimal",
    "ghc925-minimal",
    "ghc8107-static-minimal",
    "ghc902-static-minimal",
    "ghc925-static-minimal",
]
T = {}
flake = "input-output-hk/devx"  # vs. yvan-sraka/static-closure
for devShell in DEV_SHELLS:
    # Empty the store first so that "bootstrap" measures a cold start.
    os.system("nix-collect-garbage -d")
    cmd = (
        f'nix develop "github:{flake}#{devShell}" '
        "--no-write-lock-file --refresh --command true"
    )
    # Total wall-clock seconds for `number` consecutive shell entries.
    x = lambda number: round(timeit.timeit(lambda: os.system(cmd), number=number), 2)
    T[devShell] = {"bootstrap": x(1), "reload": x(10)}
print(T)
The list above is exhaustive for the currently working input-output-hk/devx devShells; see issues #23 and #24. My machine (a 24" M1 iMac) uses the zw3rk.com cache (but I disabled remote builders), as I wanted to match what I imagine the “default” user settings to be.
n.b. I blindly chose Python because there may be a future where I want to compute some basic statistics with numpy or plot graphs with matplotlib!
The hack lives in this flake.nix shellHook.
I'll post the benchmark result I got in this thread :)
This is good, looking forward to the benchmarks!
The awaited benchmarks, which ran last night on my machine (values are in seconds):
Without the speed download hack (meaning the actual nix develop github:input-output-hk/devx#$key flake):
ghc8107:
bootstrap: 2155.54
reload: 4.57
ghc902: broken
ghc925:
bootstrap: 2005.21
reload: 4.43
ghc8107-minimal:
bootstrap: 1820.64
reload: 3.59
ghc902-minimal:
bootstrap: 1788.87
reload: 3.33
ghc925-minimal:
bootstrap: 1826.22
reload: 3.76
ghc8107-static-minimal:
bootstrap: 1774.84
reload: 3.46
ghc902-static-minimal:
bootstrap: 1780.89
reload: 4.75
ghc925-static-minimal:
bootstrap: 1871.04
reload: 3.4
With the speed download hack of version 70e3884 (that could be summed up as: curl https://s3.zw3rk.com/devx/$arch.$key.zstd | zstd -d | nix-store --import and then the env trick. It does cache the download, but unconditionally does the nix-store import for each reload…):
ghc8107: broken
ghc902: broken
ghc925: broken
ghc8107-minimal:
bootstrap: 95.71
reload: 37.74
ghc902-minimal:
bootstrap: 100.58
reload: 37.6
ghc925-minimal:
bootstrap: 101.05
reload: 33.25
ghc8107-static-minimal:
bootstrap: 82.32
reload: 30.61
ghc902-static-minimal:
bootstrap: 91.1
reload: 31.1
ghc925-static-minimal:
bootstrap: 88.54
reload: 27.21
First, there are a few settings that are “broken”, and I should investigate why … Then, as you can see, it's a big improvement in bootstrap speed (my internet connection is quite slow, which surely inflates the numbers) …
… but there is more work to do, as @angerman made me realize: the wrapper derivation should not have to rely on minio-client! And the reload time here is bad (the measurement is 10x re-entering the shell): I should fix that, so it behaves at least like the “without the speed download hack” nix develop, and I believe I can potentially even shave those numbers a bit. :)
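To put the two result sets side by side, here is a minimal sketch that derives the bootstrap speedup and the per-entry reload time for two of the flavors. The numbers are copied from the lists above; the function name `compare` and the dictionary layout are only illustrative, and the division by 10 assumes that "reload" is the total for the 10 re-entries, as the script's `number=10` suggests.

```python
# Values copied from the two result lists above (seconds).
WITHOUT_HACK = {
    "ghc8107-minimal": {"bootstrap": 1820.64, "reload": 3.59},
    "ghc925-static-minimal": {"bootstrap": 1871.04, "reload": 3.4},
}
WITH_HACK = {
    "ghc8107-minimal": {"bootstrap": 95.71, "reload": 37.74},
    "ghc925-static-minimal": {"bootstrap": 88.54, "reload": 27.21},
}

# Assumption: "reload" is the total time of 10 consecutive shell entries.
RELOAD_RUNS = 10


def compare(shell):
    """Return bootstrap speedup and per-entry reload times for one flavor."""
    before, after = WITHOUT_HACK[shell], WITH_HACK[shell]
    return {
        "bootstrap_speedup": before["bootstrap"] / after["bootstrap"],
        "reload_per_entry_before": before["reload"] / RELOAD_RUNS,
        "reload_per_entry_after": after["reload"] / RELOAD_RUNS,
    }


for shell in WITHOUT_HACK:
    stats = compare(shell)
    print(
        f"{shell}: bootstrap x{stats['bootstrap_speedup']:.1f} faster, "
        f"reload {stats['reload_per_entry_before']:.2f}s -> "
        f"{stats['reload_per_entry_after']:.2f}s per entry"
    )
```

Read this way, the bootstrap gain is large, while each re-entry becomes roughly ten times slower, which matches the concern about the unconditional import.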
@yvan-sraka can you please update the comment above with the following remarks:
nix develop github:input-output-hk/devx#$key, and the second is effectively curl https://s3.zw3rk.com/devx/$arch.$key.zstd | zstd -d | nix-store --import; and then the env. Though you do cache the download through fetchurl in nix, you unconditionally do the import for each run (and reload).
If we had a canary derivation (e.g. the root of the imported closure), we could validate the existence of the closure in the store, by checking for the existence of that file; and skip the import?
> @yvan-sraka can you please update the comment above with the following remarks:
Edited :)
> If we had a canary derivation (e.g. the root of the imported closure), we could validate the existence of the closure in the store, by checking for the existence of that file; and skip the import?
Yes! That's precisely what I have in mind, and I implemented it in a new flake version that should also fix the broken builds. I should indeed re-run the benchmarks against this new flake version, which is currently --impure.
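The canary idea discussed above can be sketched as follows. This is not the flake's actual shellHook, only an illustration in Python: the function name `import_closure_if_missing`, its parameters, and the injectable `run` argument are hypothetical, and the canary path is assumed to be a store path that exists only once the closure has been imported.

```python
import os
import subprocess


def import_closure_if_missing(canary_path, archive_url, run=subprocess.run):
    """Import the closure only when the canary store path is absent.

    Returns True if an import was attempted, False if it was skipped.
    """
    if os.path.exists(canary_path):
        # The closure root is already in the store: nothing to do.
        return False
    # Same pipeline as summarized in the thread. Note that with a plain
    # `sh`, the exit status only reflects the last command of the pipe.
    pipeline = f'curl -sSfL "{archive_url}" | zstd -d | nix-store --import'
    run(pipeline, shell=True, check=True)
    return True
```

With a check like this, a reload only costs a file existence test once the first import has succeeded, instead of a full `nix-store --import`.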
On aarch64-darwin, the flavors ghc8107 and ghc902 are failing because of:
@nix { "action": "setPhase", "phase": "unpackPhase" }
unpacking sources
unpacking source archive /nix/store/9pqv84n4fxaadafjx32wi4c7d044xb0z-hlint-3.5-src
source root is hlint-3.5-src
@nix { "action": "setPhase", "phase": "patchPhase" }
patching sources
@nix { "action": "setPhase", "phase": "updateAutotoolsGnuConfigScriptsPhase" }
updateAutotoolsGnuConfigScriptsPhase
@nix { "action": "setPhase", "phase": "configurePhase" }
configuring
Configure flags:
--prefix=/nix/store/rvb3z44kwnwni719lndy9qz2dp84qxmw-hlint-exe-hlint-3.5 exe:hlint --package-db=clear --package-db=/nix/store/bvjs7g3g5i10h7pl360kpsbbh39m9y2s-hlint-exe-hlint-3.5-config/lib/ghc-9.0.2/package.conf.d --flags=ghc-lib --flags=gpl --flags=-hsyaml --flags=threaded --exact-configuration --dependency=hlint=hlint-3.5-DVvFAeGfGhl4cGfHK851Zv --dependency=array=array-0.5.4.0 --dependency=base=base-4.15.1.0 --dependency=deepseq=deepseq-1.4.5.0 --dependency=ghc-bignum=ghc-bignum-1.1 --dependency=ghc-boot-th=ghc-boot-th-9.0.2 --dependency=ghc-prim=ghc-prim-0.7.0 --dependency=integer-gmp=integer-gmp-1.1 --dependency=pretty=pretty-1.1.3.6 --dependency=rts=rts --dependency=template-haskell=template-haskell-2.17.0.0 --with-ghc=ghc --with-ghc-pkg=ghc-pkg --with-hsc2hs=hsc2hs --with-gcc=cc --with-ld=ld --with-ar=ar --with-strip=strip --disable-executable-stripping --disable-library-stripping --disable-library-profiling --disable-profiling --enable-static --enable-shared --disable-coverage --enable-library-for-ghci --datadir=/nix/store/44xd104h90xxkjbvg6sdriq471mpzir2-hlint-exe-hlint-3.5-data/share/ghc-9.0.2 --ghc-option=-fPIC --gcc-option=-fPIC
Configuring executable 'hlint' for hlint-3.5..
@nix { "action": "setPhase", "phase": "buildPhase" }
building
Preprocessing executable 'hlint' for hlint-3.5..
Building executable 'hlint' for hlint-3.5..
[1 of 1] Compiling Main ( src/Main.hs, dist/build/hlint/hlint-tmp/Main.o )
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
Linking dist/build/hlint/hlint ...
/nix/store/48py6zrawzim9ghrnkqwm36jl4j1l23x-clang-wrapper-11.1.0/bin/ld: line 256: 26817 Segmentation fault: 11 /nix/store/5wvlj00dr22ivh210b18ccv1i60h6c1q-cctools-binutils-darwin-949.0.1/bin/ld ${extraBefore+"${extraBefore[@]}"} ${params+"${params[@]}"} ${extraAfter+"${extraAfter[@]}"}
clang-11: error: linker command failed with exit code 139 (use -v to see invocation)
`clang' failed in phase `Linker'. (Exit code: 139)
I will re-run the benchmarks in a GitHub Actions context to get some consistency, since my personal internet connection is currently not reliable enough to avoid skewing the results …
I don't think speed is the primary issue, as long as it's consistent. E.g. if you always get the same speed reliably that's going to provide good numbers. And most users won't be having 1G or 10G lines, but some XXX Mbit most likely.
If we find out that for fast lines, it's even worse though, that would also be good to know.
> I don't think speed is the primary issue, as long as it's consistent. E.g. if you always get the same speed reliably that's going to provide good numbers. And most users won't be having 1G or 10G lines, but some XXX Mbit most likely.
> If we find out that for fast lines, it's even worse though, that would also be good to know.
Yes! My current connection issues are effectively more about consistency than speed :)
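One way to check consistency rather than raw speed is to keep individual samples and look at their spread. The sketch below is an assumption about how that could be done, not something from the benchmark script: `measure` is a hypothetical helper that times a callable once per sample with `timeit.repeat` and reports the mean, standard deviation, and coefficient of variation.

```python
import statistics
import timeit


def measure(fn, repeat=5):
    """Time `fn` `repeat` times (one call per sample) and summarize the spread."""
    samples = timeit.repeat(fn, number=1, repeat=repeat)
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {
        "samples": samples,
        "mean": mean,
        "stdev": stdev,
        # Relative spread: a small value means the runs were consistent.
        "cv": stdev / mean if mean else 0.0,
    }
```

For instance, `measure(lambda: os.system('nix develop ... --command true'))` would give per-entry samples; a large `cv` would suggest the connection, not the flake, dominates the measurement.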
I've run new benchmarks of @hamishmack's fetch-docker.sh, in order to integrate them into the engineering blog post, with this new script:
#! /usr/bin/env python
import os


def Dockerfile(shell, fast):
    # "fast" imports the pre-built closure; "slow" is the plain `nix develop`.
    if fast:
        cmd = f'./fetch-docker.sh input-output-hk/devx x86_64-linux.{shell}-env | zstd -d | nix-store --import'
    else:
        cmd = f'nix develop "github:input-output-hk/devx#{shell}" --command true'
    return f"""FROM nixos/nix
RUN nix-channel --update
RUN echo "experimental-features = nix-command flakes" >> /etc/nix/nix.conf
RUN echo "accept-flake-config = true" >> /etc/nix/nix.conf
RUN ln -s $(which bash) /bin/bash
RUN nix profile install "nixpkgs#jq" "nixpkgs#zstd"
RUN curl -L https://raw.githubusercontent.com/input-output-hk/actions/latest/devx/support/fetch-docker.sh -o fetch-docker.sh
RUN chmod +x fetch-docker.sh
RUN time {cmd}"""


for shell in ["ghc8107-iog", "ghc962-iog", "ghc8107-static-minimal", "ghc962-static-minimal"]:
    for fast in [True, False]:
        with open('Dockerfile', 'w') as file:
            file.write(Dockerfile(shell, fast))
        # Keep the build output so the `real` timing can be extracted later.
        os.system(f'docker build . | tee {shell}-{"fast" if fast else "slow"}.log')
… and I retrieve the results with:
#! /usr/bin/env bash
for file in *.log; do
    grep '^real' "$file" | awk -v file="$file" '{print file " -> " $2}'
done
… I ran it on my old ThinkPad X230 over a French countryside internet connection, to show how it changes loading times in contexts where bandwidth and computing power are scarce resources, as in a heavy CI:
ghc8107-iog-fast.log -> 8m28.296s
ghc8107-iog-slow.log -> 11m28.658s
ghc8107-static-minimal-fast.log -> 2m35.712s
ghc8107-static-minimal-slow.log -> 5m22.060s
ghc962-iog-fast.log -> 7m26.853s
ghc962-iog-slow.log -> 11m9.047s
ghc962-static-minimal-fast.log -> 1m35.069s
ghc962-static-minimal-slow.log -> 4m48.812s
… should I run these benchmarks with other devx closure flavors or in other execution environments?
So it's about 50% of the total time. That's good! Thanks for running the benchmarks!
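For a per-flavor view of the timings listed above, here is a small sketch that converts the `real` values to seconds and prints the fast/slow ratio. The durations are copied from the results; `to_seconds` and the `RESULTS` layout are only illustrative names.

```python
import re

# (fast, slow) `real` timings copied from the results above.
RESULTS = {
    "ghc8107-iog": ("8m28.296s", "11m28.658s"),
    "ghc8107-static-minimal": ("2m35.712s", "5m22.060s"),
    "ghc962-iog": ("7m26.853s", "11m9.047s"),
    "ghc962-static-minimal": ("1m35.069s", "4m48.812s"),
}


def to_seconds(real):
    """Convert a bash `time` duration such as '8m28.296s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if match is None:
        raise ValueError(f"unexpected duration: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


for shell, (fast, slow) in RESULTS.items():
    ratio = to_seconds(fast) / to_seconds(slow)
    print(f"{shell}: fast takes {ratio:.0%} of the slow time")
```

The ratio varies noticeably between flavors, with the static-minimal shells benefiting more than the iog ones.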
When using GHA to turn iohk/devx into a shell to run, e.g. cabal update, cabal build, we take a lot of time downloading stuff. A lot of this time comes down to nix sequentially downloading a lot of data. We should build the store and export/import it instead to speed this up.
… will give us something like 2.5G. That's a lot.
We can also enter the shell using (e.g. instead of nix develop).
We could pre-build the closure (e.g. from result), and store that as a zstd compressed archive:
And then re-import this as the first step in GHAs after setting up nix.
See for example this GHA: https://github.com/angerman/x/blob/c559ae0429bb69829a9c9cae8c21ab777461aaf2/.github/workflows/main.yml#L23-L66, which doesn't work properly yet (nix still ends up downloading stuff when trying to enter the shell; maybe this can be eliminated with the env.sh idea from above).
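The export/import round trip described above could look roughly like the following. The exact commands from the original post are not preserved here, so this is only a sketch under assumptions: it builds shell command strings using `nix-store -qR` (list a path's closure), `nix-store --export` / `--import`, and `zstd`; the helper names and the `result` / archive arguments are hypothetical.

```python
import shlex


def export_command(result_link, archive):
    """Shell command that serializes the closure of `result_link` into `archive`."""
    link, out = shlex.quote(result_link), shlex.quote(archive)
    # `nix-store -qR` prints every store path the build result depends on.
    return f"nix-store --export $(nix-store -qR {link}) | zstd > {out}"


def import_command(archive):
    """Shell command that loads a previously exported closure into the store."""
    # `zstd -dc` decompresses to stdout so the stream can be piped on.
    return f"zstd -dc {shlex.quote(archive)} | nix-store --import"


print(export_command("result", "closure.zstd"))
print(import_command("closure.zstd"))
```

The export side would run once where the closure is built, and the import side as the first step of the workflow after nix is set up.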