Open ehildenb opened 5 years ago
Liq-2.0 and see what our coverage there is, if the unit tests are helpful for coverage.
Vat_flux-diff_pass_rough), and it appears we are not using sub-lemmas. So I'm trying to figure out the best way to add log information about why the lemmas are not being applied.
#gas is a functional term. Just had to modify the generated sublemmas to add matching(infGas) to tell it "go ahead and match on this particular function symbol".
#if_#then_#else_#fi. I'm working through this one currently. Also, AFAICT, it only ends up affecting 1 proof that I'm currently running. I'm just not sure whether it's quietly affecting other proofs, or will affect some of the bigger proofs.
addui, mului, etc. lemmas. These have trouble applying because they have a function symbol on top of the wordStack (that is, they have things like chop(ABI_y) on the lhs of the sub-lemma), which the java-backend won't try to match/unify with whatever exists there (usually a concrete or symbolic number). After trying several techniques to get these to apply, I'm disabling these lemmas for now, because they're small and don't have an appreciable impact on performance.
- add package.json - npm init -yf
- add truffle-config.js
- move test out of src/
For each test:
- fix test imports
  - import "ds-test/test.sol" => import "../lib/ds-test/src/test.sol"
  - import "../dai.sol" => import "../src/dai.sol"
- replace dapp asserts with truffle ones
  - import "truffle/Assert.sol"
  - assertEq => Assert.equal
  - assertTrue => Assert.isTrue
  - I did it with a regex: s/assertEq(\(.*\))/Assert.equal(\1, "")/
- remove everything related to Hevm
  - Hevm hevm declaration in test contract
  - hevm.warp calls (https://github.com/dapphub/dapptools/tree/master/src/hevm#cheat-codes)
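The import and assert rewrites listed above can be sketched as a small Python script. This is a hypothetical helper, not the regex that was actually run: the function name convert_line, the one-assert-per-line assumption, and the empty message argument passed to Assert.isTrue are all assumptions made for illustration.

```python
import re

# Import rewrites taken from the list above.
IMPORT_REWRITES = {
    'import "ds-test/test.sol"': 'import "../lib/ds-test/src/test.sol"',
    'import "../dai.sol"': 'import "../src/dai.sol"',
}

# Assumption: each assert call sits on a single line and ends with ");".
ASSERT_EQ = re.compile(r'\bassertEq\((.*)\);')
ASSERT_TRUE = re.compile(r'\bassertTrue\((.*)\);')


def convert_line(line):
    """Rewrite one line of a dapp-style Solidity test into truffle style."""
    for old, new in IMPORT_REWRITES.items():
        line = line.replace(old, new)
    # Truffle's Assert functions take a trailing message argument;
    # an empty string is used here as a placeholder.
    line = ASSERT_EQ.sub(r'Assert.equal(\1, "");', line)
    line = ASSERT_TRUE.sub(r'Assert.isTrue(\1, "");', line)
    return line


if __name__ == "__main__":
    sample = [
        'import "ds-test/test.sol";',
        '        assertEq(dai.balanceOf(self), 10);',
        '        assertTrue(ok);',
    ]
    for src in sample:
        print(convert_line(src))
```

The script does not add the import "truffle/Assert.sol" line or strip the Hevm declarations; those steps would still be done by hand.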
CALL, and in many of those proofs up to 10% performance improvement): kframework/evm-semantics#952
With 5 min timeout, 12 in parallel at a time (4 hrs):
tried: 813 - 0.8041543026706232
passing: 711 - 0.8745387453874539
timeout: 95 - 0.11685116851168512
Re-running with 20 min timeout, 8 in parallel at a time (4 hrs):
tried: 913 - 0.9030662710187932
passing: 840 - 0.9200438116100766
timeout: 52 - 0.056955093099671415
Re-running with 120 min timeout, 6 in parallel at a time (8 hrs):
tried: 936 - 0.9258160237388724
passing: 884 - 0.9444444444444444
timeout: 5 - 0.005341880341880342
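The fractions in these status blocks appear to follow a fixed pattern: "tried" is divided by the total number of proofs, while "passing" and "timeout" are divided by the number tried. A minimal Python sketch under that reading follows. The total of 1011 is inferred from the ratios (813 / 0.8041543026706232) and is not stated anywhere in the thread; the function names are hypothetical.

```python
# Assumption: 1011 total proofs, inferred from the reported ratios.
TOTAL_PROOFS = 1011


def status_ratios(tried, passing, timeout, total=TOTAL_PROOFS):
    """Return the three fractions shown next to each count."""
    return {
        "tried": tried / total,      # share of all proofs that were attempted
        "passing": passing / tried,  # share of attempted proofs that passed
        "timeout": timeout / tried,  # share of attempted proofs that timed out
    }


def format_status(tried, passing, timeout, total=TOTAL_PROOFS):
    """Render a status block in the same 'name: count - ratio' layout."""
    r = status_ratios(tried, passing, timeout, total)
    return "\n".join([
        f"tried: {tried} - {r['tried']}",
        f"passing: {passing} - {r['passing']}",
        f"timeout: {timeout} - {r['timeout']}",
    ])


if __name__ == "__main__":
    # Counts from the 5 min timeout run.
    print(format_status(813, 711, 95))
```

Under this reading, the counts left over after passing and timeout (for example 813 - 711 - 95 = 7 in the 5 min run) would be proofs that failed outright.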
Merged: vat-subui lemmas: https://github.com/kframework/evm-semantics/pull/949
Merged: WordPack abstractions for flopper: https://github.com/kframework/evm-semantics/pull/948
In Review: Better infinite gas reasoning: https://github.com/kframework/evm-semantics/pull/952
PRs to be opened: vow-cage-surplus-pass-rough, vow-cage-deficit-pass-rough, vat-frob-diff-zero-dart-pass, vat-fork-diff-pass-rough, flopper-dent-guy-diff-tic-not-0-pass, flapper-tend-guy-diff-pass-rough
Current Status (timeout is 120min, runtime is ~13hrs):
tried: 964 - 0.9535113748763601
passing: 923 - 0.9574688796680498
timeout: 6 - 0.006224066390041493
Remaining issues: (i) _/sWord_ lemma that I cannot prove sound yet, so I can't open a PR (the majority of failing proofs are due to this). (ii) rpow proofs.
Full run of all the proofs:
[test "deps"]
step = rm -rf deps out
step = git submodule update --init --recursive
step = make include.mak
step = make deps -j3
[test "5min"]
step = rm -rf log-prove-5
step = echo % Timeout = 5 >> log-prove
step = echo >> log-prove
step = make prove -j12 -k KLAB='profile log-prove-5 timeout 300 klab' CHECK_SUB_LEMMAS=true || true
step = cat log-prove-5 | sort -h -k2 >> log-prove
step = echo >> log-prove
[test "120min"]
step = rm -rf log-prove-120
step = echo % Timeout = 120 >> log-prove
step = echo >> log-prove
step = make prove -j8 -k KLAB='profile log-prove-120 timeout 7200 klab' CHECK_SUB_LEMMAS=true || true
step = cat log-prove-120 | sort -h -k2 >> log-prove
step = echo >> log-prove
Status (still running):
tried: 955 - 0.9446092977250248
passing: 919 - 0.962303664921466
timeout: 6 - 0.0062827225130890054
The regression in the number of tried proofs is because some of the changes above caused gas expressions to change, which caused some passing quick-running proofs to fail. This shouldn't be hard to fix, but I haven't had time to investigate yet. There are 12 new proofs running, and 23 new proofs passing, but the quick-running proof that is now failing caused others to not be generated or run.
Current status:
tried: 973 - 0.9624134520276953
passing: 940 - 0.9660842754367934
timeout: 3 - 0.003083247687564234
Unfortunately we didn't take meeting notes from the beginning, but hopefully this helps the project owners on the MKR side report progress.