Closed: fspoettel closed this issue 1 year ago
This would be great! It'd also be nice if there was a way of adding the benchmarks to the README stars table in some way 🤔.
Another thing along these lines that I added to my repo created from this template (thank you for this, by the way!) was a way to test the solution, not just the example, so that if I go refactoring and optimizing I don't accidentally break my implementation for the real input. I had to leave the test ignored so it didn't break CI, which doesn't have the real inputs downloaded, but it's something. Anyway, I thought I'd send it your way just to get your thoughts on it since I saw you had some RFCs up :)
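For illustration, a minimal sketch of what such an ignored test could look like, assuming the day's solution exposes a `part_one` function in the same module and the real input sits at `src/inputs/01.txt` (the path and the expected answer below are placeholders for whatever your repo actually uses):

```rust
#[cfg(test)]
mod real_input_tests {
    use super::*;

    // Ignored by default so CI, which has no real inputs, stays green.
    // Run locally with: cargo test -- --ignored
    #[test]
    #[ignore]
    fn test_part_one_real_input() {
        // Path and expected answer are placeholders for your own setup.
        let input = std::fs::read_to_string("src/inputs/01.txt")
            .expect("real input not downloaded");
        assert_eq!(part_one(&input), Some(24000));
    }
}
```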
It'd also be nice if there was a way of adding the benchmarks to the README stars table in some way 🤔.
Cool idea! Not sure if we should try to extend the table or set up a different block. Maybe (a tweaked version of) the output of `cargo all --release` in a code block with some sysinfo? E.g.:
{some sys_info}
Day 01
-------
Parser: avg. time: 35.68µs / 16298 samples
Part 1: avg. time: 5.00ns / 4000000 samples
Part 2: avg. time: 5.10µs / 176475 samples
Total: 40.79µs
Day 02
-------
Not solved.
...
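If the `{some sys_info}` header should avoid extra dependencies, here is a rough sketch of what the standard library alone can provide; anything richer (CPU model, RAM) would need a crate such as sysinfo, and the field names here are purely illustrative:

```rust
/// Minimal sys-info header built only from std; format is illustrative.
fn sys_info_header() -> String {
    format!(
        "OS/arch: {}/{} | build: {}",
        std::env::consts::OS,
        std::env::consts::ARCH,
        if cfg!(debug_assertions) { "debug" } else { "release" },
    )
}

fn main() {
    println!("{}", sys_info_header());
}
```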
CI which doesn't have the real inputs downloaded
I'm a bit on the fence about that gitignore - I understand where it is coming from, but in reality most people check in the inputs anyway.
It'd also be nice if there was a way of adding the benchmarks to the README stars table in some way 🤔.
Cool idea! Not sure if we should try to extend the table or set up a different block. Maybe (a tweaked version of) the output of `cargo all --release` in a code block with some sysinfo? E.g.:
The problem with adding this to the stars table is that the current template does not verify that we generate the correct result, and the CI hook that adds the stars is probably not the best place to benchmark the solution.
It should be possible to add a function to the solve program, or even add a submit program that automatically submits the result to the adventofcode webpage. I think I have seen a Python template that does that. That submit program could then run the benchmark, submit the result, and update the README.
CI which doesn't have the real inputs downloaded
I'm a bit on the fence about that gitignore - I understand where it is coming from, but in reality most people check in the inputs anyway.
I won't argue with anyone who wants to commit the inputs themselves, but if the developer of AoC specifically asked that we don't publish the inputs, maybe a template isn't the best place to ignore that.
Cool idea! Not sure if we should try to extend the table or set up a different block. Maybe (a tweaked version of) the output of `cargo all --release` in a code block with some sysinfo? E.g.:

Benchmarks

{some sys_info}

Day 01
-------
Parser: avg. time: 35.68µs / 16298 samples
Part 1: avg. time: 5.00ns / 4000000 samples
Part 2: avg. time: 5.10µs / 176475 samples
Total: 40.79µs

Day 02
-------
Not solved.

...
Yeah this would be great!
CI which doesn't have the real inputs downloaded
I'm a bit on the fence about that gitignore - I understand where it is coming from, but in reality most people check in the inputs anyway.
If the dev of AoC asked for them not to be checked in, I don't want my stuff checking it in. Another option would be giving the aoc-cli a way to pull the session token from an env var instead of a file, so that it could then be stored in Secrets, at which point CI could just run `cargo download` itself before running tests. Might be nice to also add a `--all` flag to `download` as well.
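As a sketch of what that `--all` flag could look like in the template's download binary: the argument handling, the 1..=25 range, and the exact aoc-cli flags are all assumptions (check `aoc help download` for the real interface), not the template's actual CLI:

```rust
use std::env;
use std::process::Command;

// Shell out to aoc-cli for a single day, as the template already does.
fn download_day(day: u8) -> std::io::Result<std::process::ExitStatus> {
    Command::new("aoc")
        .arg("download")
        .arg("--day")
        .arg(day.to_string())
        .status()
}

fn main() -> std::io::Result<()> {
    let args: Vec<String> = env::args().skip(1).collect();

    if args.iter().any(|a| a == "--all") {
        // `cargo download --all`: fetch every puzzle input in one go.
        for day in 1..=25 {
            download_day(day)?;
        }
    } else if let Some(day) = args.first() {
        download_day(day.parse().expect("day must be a number"))?;
    } else {
        eprintln!("usage: cargo download <day> | --all");
    }
    Ok(())
}
```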
The problem with adding this to the stars table is that the current template does not verify that we generate the correct result, and the CI hook that adds the stars is probably not the best place to benchmark the solution.
It should be possible to add a function to the solve program, or even add a submit program that automatically submits the result to the adventofcode webpage. I think I have seen a Python template that does that. That submit program could then run the benchmark, submit the result, and update the README.
Oh yeah, I'm perfectly happy with it not being part of CI, but instead being done locally. So long as these things are opt-in via a parameter, at least. Something like `cargo solve 04 --benchmark --submit`, with defaults for both being false?
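As a rough sketch of how such opt-in flags could be parsed in the `solve` binary with plain `std::env::args` and no extra dependency; the flag names mirror the suggestion above and everything else is a placeholder:

```rust
use std::env;

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();

    // First positional argument is the day, e.g. `cargo solve 04 --benchmark`.
    let day = args
        .iter()
        .find(|a| !a.starts_with("--"))
        .expect("usage: cargo solve <day> [--benchmark] [--submit]")
        .clone();

    // Both flags default to false and are purely opt-in.
    let benchmark = args.iter().any(|a| a == "--benchmark");
    let submit = args.iter().any(|a| a == "--submit");

    println!("day {day}: benchmark={benchmark}, submit={submit}");
    // ...run the solution here, then optionally benchmark and/or submit...
}
```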
add a submit program that automatically submits the result to the adventofcode webpage.
The crate we use for downloading inputs supports input submissions, so we could leverage it here. I would accept a PR for that if someone is interested, and I created #19 for this.
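Not the actual implementation, but one way this could look if the template keeps shelling out to the same tool it already uses for downloads: the subcommand layout below is an assumption and should be double-checked against `aoc help submit`, and the values in `main` are placeholders.

```rust
use std::process::Command;

/// Sketch: submit an answer by shelling out to aoc-cli, the same way the
/// template already shells out for downloads. The argument layout is assumed,
/// not verified against a specific aoc-cli version.
fn submit(day: u8, part: u8, answer: &str) -> Result<(), String> {
    let status = Command::new("aoc")
        .arg("submit")
        .arg("--day")
        .arg(day.to_string())
        .arg(part.to_string())
        .arg(answer)
        .status()
        .map_err(|e| format!("failed to spawn aoc-cli: {e}"))?;

    if status.success() {
        Ok(())
    } else {
        Err(format!("aoc-cli exited with {status}"))
    }
}

fn main() {
    // Placeholder values for illustration only.
    if let Err(e) = submit(4, 1, "12345") {
        eprintln!("{e}");
    }
}
```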
`cargo solve 04 --benchmark --submit`
This looks like a good way forward. I have a somewhat working implementation of benchmarks on my repo for this year - I'll try and put it behind a `--bench` flag and open a PR with the changes if any of you want to test / review.
give the aoc-cli a way to pull the session token from an env var
I think we could implement this on our side with the `--session-file` argument to aoc-cli and reuse the `AOC_SESSION` we already use for the table. The only annoyance here is that it might not be present, which makes the CI a bit more complex.
edit: forgot to mention that my preference for benchmarking is that it should be done client-side. :)
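A sketch of that env-var bridge, assuming we keep shelling out to aoc-cli and that its `--session-file` option works as described above; the `download` wrapper, the day flag, and the temp-file path are placeholders:

```rust
use std::env;
use std::fs;
use std::process::Command;

/// If AOC_SESSION is set (e.g. from a CI secret), write it to a temporary
/// session file and pass that to aoc-cli via --session-file; otherwise let
/// aoc-cli fall back to its default session lookup.
fn download(day: u8) -> std::io::Result<()> {
    let mut cmd = Command::new("aoc");
    cmd.arg("download").arg("--day").arg(day.to_string());

    if let Ok(token) = env::var("AOC_SESSION") {
        let session_path = env::temp_dir().join("aoc_session");
        fs::write(&session_path, token)?;
        cmd.arg("--session-file").arg(&session_path);
    }

    cmd.status().map(|_| ())
}

fn main() -> std::io::Result<()> {
    download(1)
}
```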
I am currently learning Rust, which is how I found this repo. I already started to work on a "simple" solution for this using criterion. Right now I have a PoC for criterion working, running my solution for day 1: https://github.com/fspoettel/advent-of-code-rust/pull/20. Next step would be reading the benchmark results and updating the README. After that I might also look into automatic submission.
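For anyone curious what the criterion wiring roughly looks like, here is a minimal sketch, assuming the day's solution is exposed from a library crate as `advent_of_code::day01::part_one` (a made-up path) and that a `benches/day01.rs` target with `harness = false` is declared in Cargo.toml:

```rust
// benches/day01.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Assumed library path; adjust to wherever the solution actually lives.
use advent_of_code::day01::part_one;

fn bench_day01(c: &mut Criterion) {
    let input = std::fs::read_to_string("src/inputs/01.txt").expect("input file");
    c.bench_function("day01 part1", |b| b.iter(|| part_one(black_box(&input))));
}

criterion_group!(benches, bench_day01);
criterion_main!(benches);
```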
I think we could implement this on our side with the `--session-file` argument to aoc-cli and reuse the `AOC_SESSION` we already use for the table. The only annoyance here is that it might not be present, which makes the CI a bit more complex.
I was just looking at the aoc-cli repo to see about taking a crack at implementing #19, and saw that this PR adding the ability for aoc-cli to use an env var for the session was just opened. So that may be easier to do soon :)
I implemented a proof of concept for README benchmarks on my 2022 repo to get a better idea of how this would work with the current code layout.
The code is still unfinished and a bit terrible, but benchmarks are written to the README automatically when `cargo all --release` is executed. The (non-scientific) benchmarks work by executing the code between 10 and 100,000 times, depending on how long the initial execution took, in order to try and stay below a second per solution part. Not too sure about the output format; I may round this to milliseconds eventually, which should be an easy enough change.
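Since the loop is easier to read as code than prose, here is a rough sketch of that adaptive scheme (not the code from the linked repo): time one execution, derive a sample count clamped to 10..=100,000 so the whole run stays near a second, then report the average:

```rust
use std::time::{Duration, Instant};

/// Rough sketch of the adaptive benchmark described above.
fn bench<F: Fn()>(f: F) -> (Duration, u32) {
    // One warm-up run to estimate how expensive the solution part is.
    let start = Instant::now();
    f();
    let first = start.elapsed();

    // Pick a sample count that keeps the total near one second,
    // clamped to the 10..=100_000 range.
    let budget = Duration::from_secs(1);
    let samples = (budget.as_nanos() / first.as_nanos().max(1)) as u32;
    let samples = samples.clamp(10, 100_000);

    let start = Instant::now();
    for _ in 0..samples {
        f();
    }
    (start.elapsed() / samples, samples)
}

fn main() {
    let (avg, samples) = bench(|| {
        // Placeholder workload standing in for a solution part.
        let _: u64 = (0..10_000u64).sum();
    });
    println!("avg. time: {avg:?} / {samples} samples");
}
```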
I also updated the `solve` command-line output to use this benchmark method - see the following video for how this looks:
https://user-images.githubusercontent.com/1682504/205697329-fec0da10-71e0-4045-888c-b08a0e509c7b.mov
Implementation
Currently experimenting with a higher sample count for benching on my solution repo; might add it to the template after this year concludes. Unfortunately, using criterion with binary crates is not possible yet, which would make this a bit simpler.

Advantages:
Disadvantages:
(not an issue so far)

Example output: