Consensys / evm-dafny

An EVM interpreter in Dafny
Apache License 2.0
126 stars 6 forks source link

Investigate Verification Problems with Dafny 4.0 #550

Closed DavePearce closed 1 year ago

DavePearce commented 1 year ago

(follows on from #538)

Since the upgrade to Dafny 4.0, there have been a number of problems with verifying code that previously wasn't a problem. I have now spent considerable time attempting to work around this, but to no avail at this time. As a result, a number of items in the code base (particularly under src/test/dafny/) have been marked as {:verify false} in order to allow some semblance of progress to continue. The point of this issue is to: (1) remember that at some point we want to remove the {:verify false} annotations; (2) to (somehow) investigate the issue further. Perhaps, even reverting Z3 used by Dafny to an older version.

A few things I have tried:

  1. Under #548 I have put through a number of optimisations which reduced the total resource usage from around ~120M to ~78M. However, that had no real effect on this issue.
  2. Experimented with some Dafny options, particularly /arith:2. That option does help in some cases, but at othertimes is a hindrance.
DavePearce commented 1 year ago

A good starting point would be to look at what is causing the test proofs to exhibit high deviation in proving time. For example on PR#553 we have this under Verification Logs (Proofs):

Name                                                        Resource Usage (CoV)
================================================================================
[5][CO] Test.Test_IR_04                                          90370993 (1.45)
[5][CO] Test.Test_IR_03                                          46504549 (0.97)
[5][CO] CallExamples.test_call_03                                28140987 (0.54)
[5][CO] Test.Test_IR_02                                          25570729 (0.36)
[5][CO] SimulationChecks.main5                                   21729496 (0.12)
[5][CO] CallExamples.contractReturns123                          18822102 (0.64)
[5][CO] Test.Test_IR_01                                          11284724 (0.75)
[5][CO] MemoryVerif_01.MSTORE_02_Proofs                           6730383 (0.49)
[5][CO] SimulationChecks.main1                                    6647152 (0.30)
[5][CO] MemoryVerif_01.MSTORE_01_Proofs                           6309538 (0.24)
[5][CO] MemoryVerif_01.MLOAD_02_Proofs                            6266956 (0.52)
[5][CO] Test10.main5                                              5026816 (0.12)
[5][CO] CallExamples.test_call_01                                 3821793 (0.23)
[5][CO] ByteTests.ReadTests                                       3731507 (0.13)
[5][CO] CallExamples.test_call_02                                 3552529 (0.19)
[5][CO] Optimisations.Proposition12a                              3278068 (0.08)
[5][CO] IntTests.NthUint8Tests                                    2449729 (0.06)
[5][CO] Test10Gas.main4aa                                         2431901 (0.20)
[5][CO] SimulationChecks.main0Func                                1622985 (0.04)
[5][CO] MemoryVerif_01.MLOAD_01_Proofs                            1579769 (0.08)
...
================================================================================
Subtotal (mean):                                                       295872706
Total (mean):                                                          317714819

Where Cov means Coefficient of Variation. Thus, for example, Test.Test_IR_04 is showing quite high variation for reasons unknown.

DavePearce commented 1 year ago

Apparently, /arith:6 onwards as a switch allows more things to verify. We should investigate this further.

DavePearce commented 1 year ago

We want a minimal example which illustrates the problems that we could use for discussion with the Dafny people.

DavePearce commented 1 year ago

From this thread, the suggestion is to use dafny verify --boogie /proverOpt:O:smt.arith.solver=6 ...

DavePearce commented 1 year ago

Closed by #584