near / wasmtime

Standalone JIT-style runtime for WebAssembly, using Cranelift
https://wasmtime.dev/
Apache License 2.0
3 stars 4 forks source link

Should we use snapshots in tests #188

Closed MCJOHN974 closed 7 months ago

MCJOHN974 commented 9 months ago

Moved from one PR review, @akashin said:

I think we should re-consider this design decision and explore what our other options are: [Current] Generate WAT files from WAST files and run them through WAT -> ZKASM pipeline Generate ZKASM files directly from WAST files and don't commit WAT files to Git Don't commit ZKASM files to Git and instead pass generated ZKASM code directly to JS test runner We need to understand what are the pros and cons of each approach and see which one is the most appropriate for our goals.

I suggest discuss in this issue if we should migrate from snapshot pattern.

My first opinion:

Generally, I see here tradeoff between commit some data explicitly or generate-check-drop in CI. Even if we will have 10 times more tests, du of zkasm data will be 70Mb, which is I think acceptable, so I think there should be some other reasons to hide some data.

MCJOHN974 commented 9 months ago

One new cons of snapshot testing from PR discussion (thanks @ akashin and @ mooori):

aborg-dev commented 9 months ago

Thank you for starting the discussion. I agree with the points that you mentioned so far.

I think the main challenge with generated files is the amount of noise they generate in the pull requests. Take for example a recent stack handling change. It touches over 12k lines of code and 734 files. Out of those, only 5 files are essential changes to Rust with less than 100 lines changed. The rest are generated files. And as we add more spec tests, we will only have more of those. The concrete downsides that I see are:

Now, I'll admit that I've proposed this approach with snapshots in the first place and it clearly doesn't scale well with the increasing number of WAT/ZKASM files in the codebase. So it is time to revise it and see if we can come up with a better solution.

MCJOHN974 commented 9 months ago

GitHub UI is not well suited for such big PRs and is sluggish to use

Hm, I'm using just google chrome without plugins and have button "hide all files in directory", which help focus on human-written files during the review, as long as we keep generated files in one directory.

Reviewing those changes is also challenging, I don't really expect the reviewer to look over all 700 files - at most, they will look at a few of those files to see how generated code has changed. But looking at a specific file is tricky due to the previous point

Agree. But sometimemes it's nice to see 1-2-3 generated files and I don't think pick 3 files and commit them is a good option here.

Rebasing/merging PRs with generated files is tricky because they almost always result in merge conflicts

Btw, we have such problem in very high level -- almost all PR change total state.csv, and I often face merge conflict on this file. Maybe better find some more generic solution here, like githooks?

aborg-dev commented 9 months ago

Agree. But sometimemes it's nice to see 1-2-3 generated files and I don't think pick 3 files and commit them is a good option here. Btw, we have such problem in very high level -- almost all PR change total state.csv, and I often face merge conflict on this file. Maybe better find some more generic solution here, like githooks?

Could we use instead use something like workflow artifacts to store both generated ZKASM files and state.csv file and present a link to them during the PR review? This way the reviewer will have the opportunity to look at the latest generated files and we won't have other problems mentioned above.

nagisa commented 8 months ago

Could we use instead use something like workflow artifacts to store both generated ZKASM files and state.csv file and present a link to them during the PR review?

From experience, nobody will bother look at the changes in the artifacts. The generated files themselves aren’t really interesting by themselves – the changes to them are.

The current snapshots serve a primary purpose of being something skimmable to enable the ability to spot-check the changes in the generated code. I don’t think they are meant to be carefully looked through exhaustively every time. However moving the snapshots out of the diffs hurts both the skimmability and spotcheckability aspects of the setup.


That said, I agree that having 700 changed snapshots isn’t particularly tenable, but that's just the natural outcome of us choosing an approach that lets us move fast with a minimal amount of effort.

Perhaps part of the problem with it is that we’re making snapshots not of the entire file but rather one for each single function in the wasm spectest suite. In practice a function that adds up 1 + 2 is mot meaningfully different codegen wise from a function that adds up 1 + 0xFFFF_FFFF in most cases and it probably does not warrant having a snapshot for each of them at all.

cranelift filetest suite avoids this in part by:

  1. Having runtests be a separate thing from snapshotted codegen tests;
  2. Having all the snapshots show up in the test file itself, rather than as separate files;
  3. Not striving to make each snapshot a codegen test an individually runnable full-blown with assertions and such.

Which brings me to a question: why is it that instead of augmenting filecheck to serve our needs (now that we're actually spending proper amount of time to set things up the right way,) we’re thinking of ways to improve the infra we handrolled as a shortcut? It sounds to me that the only valid end state is filecheck and any additional time we spend on the shortcut only makes sense if it actually gives a huge amount of value with very little investment in engineering resource.

nagisa commented 8 months ago

Just so that it doesn’t get lost: some discussion of what it would take to correctly integrate zkasm runtests into filecheck: https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift/topic/filetest.3A.20runtests.20that.20invoke.20external.20commands/near/402481325

MCJOHN974 commented 7 months ago

Resolved by https://github.com/near/wasmtime/pull/215