sinanmohd / evanix

Nix build Scheduler
https://git.sinanmohd.com/evanix
GNU General Public License v3.0

[01] Eval/IO overhead in the greedy demo #10

Open SomeoneSerge opened 3 months ago

SomeoneSerge commented 3 months ago

Running evanix on https://github.com/ggerganov/llama.cpp in --dry-run mode takes about 20s, whereas running nix-eval-jobs takes about 6s. Per @sinanmohd's hypothesis, this is because evanix queries the substituters about cached store paths too often (does nix not maintain a negative response cache for these queries?). The run time needs to be brought down to the same order of magnitude as nix-eval-jobs so that we can stop talking about the greedy PoC.

sinanmohd commented 3 months ago

the 6s nix-eval-jobs eval is probably because it only evaluated #packages.x86_64-linux:

$ nix-eval-jobs --flake .#packages.x86_64-linux

you can get similar performance from evanix with:

$ evanix --dry-run --flake .#packages.x86_64-linux

whereas --system x86_64-linux still evaluates everything in #packages and drops derivations with a mismatched system
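The difference between the two invocations can be sketched as follows. This is only an illustration of the idea, not evanix or nix-eval-jobs code: `eval_then_filter`, `eval_selected`, and the `evaluate` callback are hypothetical names, and the flake's `packages` output is modelled as a plain dict from system name to attribute names.

```python
from typing import Callable

# Hypothetical model: packages maps a system name to its attribute names,
# and evaluate(system, attr) stands in for one (expensive) evaluation.
Evaluate = Callable[[str, str], dict]


def eval_then_filter(packages: dict[str, list[str]], system: str,
                     evaluate: Evaluate) -> list[dict]:
    """Evaluate every attribute of every system, then drop mismatches.

    Roughly what pointing at .#packages together with a system filter does.
    """
    jobs = [evaluate(sys_name, attr)
            for sys_name, attrs in packages.items()
            for attr in attrs]
    return [job for job in jobs if job["system"] == system]


def eval_selected(packages: dict[str, list[str]], system: str,
                  evaluate: Evaluate) -> list[dict]:
    """Evaluate only the attributes below the requested system.

    Roughly what pointing at .#packages.<system> does.
    """
    return [evaluate(system, attr) for attr in packages.get(system, [])]
```

Both functions return the same jobs, but the first one pays for one evaluation per attribute per system, so its cost grows with the number of systems the flake exposes.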

sinanmohd commented 3 months ago

i tried running evanix & nix-eval-jobs on https://github.com/ggerganov/llama.cpp

$ time evanix --dry-run --system x86_64-linux --flake --max-build 555 --pipelined=false --solver-report .#packages

real    0m15.625s
user    0m0.001s
sys     0m0.004s

$ time nix-eval-jobs --force-recurse --flake .#packages

real    0m15.877s
user    0m7.297s
sys     0m3.474s

the time taken should not vary that much

SomeoneSerge commented 3 months ago

Running on Nixpkgs:

❯ git rev-parse HEAD
504ca45f5c2342f36c33312be19921077f38679e
❯ time nix run github:sinanmohd/evanix -- --system x86_64-linux --flake .#legacyPackages.x86_64-linux --dry-run --max-build 50000 > evanix.stdout 2>evanix.stderr
real    263m52.738s
user    0m7.048s
sys     0m17.988s

nix-eval-jobs uses just over 3 minutes of user CPU time (about 9 minutes elapsed):

❯ nix develop github:sinanmohd/evanix --command -- time nix-eval-jobs --flake .#legacyPackages.x86_64-linux >nix-eval-jobs.stdout 2>nix-eval-jobs.stderr
❯ tail nix-eval-jobs.stderr  -n2
200.85 user 88.72 system 9:02.31 elapsed 53% CPU (0avgtext+0avgdata 4434872maxresident)k
447568inputs+253408outputs (1major+12985415minor)pagefaults 0swaps

sinanmohd commented 3 months ago

master without locality check (all drvs need to be built)

$ time evanix --flake .#legacyPackages.x86_64-linux --dry-run --max-build 2147483647
real    17m9.733s
user    14m35.086s
sys     0m1.427s

nix-eval-jobs

$ time nix-eval-jobs --flake .#legacyPackages.x86_64-linux
real    5m20.245s
user    3m11.060s
sys     1m11.604s

sinanmohd commented 2 months ago

master without locality check (all drvs need to be built)

$ time evanix --flake .#legacyPackages.x86_64-linux --dry-run --max-build 2147483647

real    5m35.859s
user    0m2.344s
sys     0m0.923s

SomeoneSerge commented 2 months ago

without locality check (all drvs need to be built)

How do you turn this mode on/off? When you turn it on, does any selection algorithm run at all (e.g. the way I imagine this is that you'd compute the full closure and generate an all-1s "value" vector)?

sinanmohd commented 2 months ago

the solver is still in use here, i used to have a patch on git stash; now you can use --cache-status=false on master

SomeoneSerge commented 2 months ago

the solver is still in use here, i used to have a patch on git stash

👍🏻 but what are the inputs to the solver?

sinanmohd commented 2 months ago

yes, if you mean cost vector by value. the cache locality check sets the cost to 0 for substituter-cached derivations, but we entirely remove locally cached dependencies from the DAG.

SomeoneSerge commented 2 months ago

So when you say "without locality check (all drvs need to be built)" you mean cost=(1, 1, ..., 1) without zeros?

SomeoneSerge commented 2 months ago

Btw, could you open an issue upstream (nix-eval-jobs) about --check-cache-status and include a reproducer, demonstrating the overhead present on the second and further runs? This might even be an issue with CppNix...

Mic92 commented 2 months ago

https://nix.dev/manual/nix/2.23/command-ref/conf-file.html#conf-narinfo-cache-negative-ttl

I would expect this setting to be used for caching the cache status. You can run 'nix-build -vvvvv --dry-run' to check in nix whether this is the case.
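For readers unfamiliar with the idea, a negative-response cache with a TTL can be sketched like this. It is a minimal illustration of the concept only and does not reflect how Nix implements its narinfo cache; `NarinfoCache`, `has_path`, and the `lookup` callback are hypothetical names.

```python
import time
from typing import Callable


class NarinfoCache:
    """Remember both "present" and "absent" answers for a limited time."""

    def __init__(self, lookup: Callable[[str], bool],
                 positive_ttl: float, negative_ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup            # the expensive substituter query
        self._positive_ttl = positive_ttl
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._entries: dict[str, tuple[bool, float]] = {}

    def has_path(self, store_path: str) -> bool:
        now = self._clock()
        entry = self._entries.get(store_path)
        if entry is not None:
            present, stored_at = entry
            ttl = self._positive_ttl if present else self._negative_ttl
            if now - stored_at < ttl:
                return present           # answered without a network query
        present = self._lookup(store_path)
        self._entries[store_path] = (present, now)
        return present
```

With such a cache, a second run shortly after the first should not have to ask the substituter again about paths it already reported as missing, which is the behaviour one would look for in the verbose output.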

sinanmohd commented 2 months ago

So when you say "without locality check (all drvs need to be built)" you mean cost=(1, 1, ..., 1) without zeros?

yes

sinanmohd commented 2 months ago

yes, if you mean cost vector by "value vector". It's as if we have an all-1s cost vector. The cache locality check sets the cost to 0 for substituter-cached derivations, but we entirely remove locally cached derivations from the DAG.
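The cost assignment described in this exchange can be sketched as below. This is an illustration under stated assumptions, not evanix's actual code: `build_costs` is a hypothetical helper, and the DAG is modelled as a dict from derivation name to the names of its dependencies.

```python
def build_costs(dag: dict[str, list[str]],
                in_local_store: set[str],
                in_substituter: set[str],
                check_cache: bool = True):
    """Return (dag, cost) as they would be handed to a solver.

    check_cache=False models "without locality check": nothing is pruned
    and every derivation costs 1.
    """
    if not check_cache:
        return {n: list(deps) for n, deps in dag.items()}, {n: 1 for n in dag}

    # Locally cached derivations are removed from the DAG entirely,
    # together with the edges that pointed at them.
    kept = {n for n in dag if n not in in_local_store}
    pruned = {n: [d for d in dag[n] if d in kept] for n in dag if n in kept}

    # Derivations available from a substituter stay in the DAG at cost 0;
    # everything else has to be built and costs 1.
    cost = {n: 0 if n in in_substituter else 1 for n in pruned}
    return pruned, cost


dag = {"app": ["lib", "tool"], "lib": ["libc"], "tool": ["libc"], "libc": []}
pruned, cost = build_costs(dag, in_local_store={"libc"}, in_substituter={"lib"})
print(pruned)  # {'app': ['lib', 'tool'], 'lib': [], 'tool': []}
print(cost)    # {'app': 1, 'lib': 0, 'tool': 1}
```

Passing `check_cache=False` yields the all-1s cost vector over the unpruned DAG, which is the configuration the timings labelled "without locality check" refer to.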
