GaloisInc / cryptol

Cryptol: The Language of Cryptography
https://galoisinc.github.io/cryptol/master/RefMan.html
BSD 3-Clause "New" or "Revised" License
1.14k stars 123 forks source link

abc not working in cryptol-remote-api nightly container #1310

Open michaelabernardo opened 2 years ago

michaelabernardo commented 2 years ago

Good morning!

We noticed in the nightly cryptol-remote-api container from two days ago that abc does not seem to be working as it previously did when trying to prove a property.

The error we get is argo_client.interaction.ArgoException: Internal error ArgoThreadTimedOut for W4_ABC and for abc,

argo_client.interaction.ArgoException: Internal error Failed to start the external solver: 
ArgoThreadTimedOut
Make sure you can start "/usr/local/bin/abc"
from the command line without issues.
michaelabernardo commented 2 years ago

Also, is this right in solver.py? W4_ABC: OnlineSolver = OnlineSolver("w4-any")

weaversa commented 2 years ago

Also, is this right in solver.py? W4_ABC: OnlineSolver = OnlineSolver("w4-any")

The commit for that is here: https://github.com/GaloisInc/cryptol/commit/5feef24c9a898cc5eb5df1a2af7f6aa968dcbad2

weaversa commented 2 years ago

The commit that changed how the solvers were installed in the container is here: https://github.com/GaloisInc/cryptol/commit/c8c122fac61191162c5c5c106735496a317ecba0#diff-92e1e9b869d970ba6ca75eebada07c31dfbda3bf4995b0db2fe5331980912df9

pnwamk commented 2 years ago

I'm not sure why things are timing out now where they did not do so before (this can happen sometimes with different solver versions of course, but I haven't specifically verified that's what's going on yet).

As a quick check, instead of using W4_ABC (which is incorrectly defined as you noted -- thanks! I've got a fix inbound for that), does using OnlineSolver("w4-abc") in its place fix the timeout issue by chance?

michaelabernardo commented 2 years ago

As a quick check, instead of using W4_ABC (which is incorrectly defined as you noted -- thanks! I've got a fix inbound for that), does using OnlineSolver("w4-abc") in its place fix the timeout issue by chance?

This didn't seem to help as far as I can tell. I mostly still got the timeout error, though in one place, I instead got:

argo_client.interaction.ArgoException: Internal error Could not parse solver response:
  Solver response parsing failure. 
*** Exception: ArgoThreadTimedOut
weaversa commented 2 years ago

@pnwamk I don't think the timeout error is really about timeouts. The first message in this thread provides: "Internal error Failed to start the external solver", as though abc is either missing or misbehaving (which happens frequently, see #915, #558, #355, #353, #233, #227, #205, and likely others. Also see some improvements put in saw to work around the brittle ABC SMTLIB2 interface: https://github.com/GaloisInc/saw-script/commit/a77b2b32c6603e307dfb9bb702f04f8b4b5c8a27 and https://github.com/GaloisInc/saw-script/commit/228c6cf9388ea828ef4ccd51266821e7728b17fb)

weaversa commented 2 years ago

@pnwamk We increased the timeout parameter on our runs and now abc runs are passing. Perhaps something changed in the client (in the past 3 or 4 months) that caused the timeout to work, or be more strict...or maybe the previous client was starting the timeout immediately, but because we were using abc (which was set to ALL) the timeout was being started after all solvers had spun up, and in the time it took to do that, abc was done? Just throwing out ideas...but, things are working again. Thanks!

pnwamk commented 2 years ago

Misc. notes/observations: the Failed to start the external solver message I think is a red herring, i.e., I suspect our timeouts are killing the solver process and the underlying SBV library doesn't specifically recognize that, and it issues that generic error message it uses for it's "catch all" case: https://github.com/LeventErkok/sbv/blob/f0b217934016e1f690fa10c43ba3e4d224decadb/Data/SBV/SMT/SMT.hs#L654-L656

pnwamk commented 2 years ago

Another possibly relevant note - with the switch to using the what4-solvers packaging of SMT solvers, we're currently using this specific commit of abc until we create a new what4-solvers release, whereas previously we were cloning and building the latest abc from source each time we built the cryptol-remote-api/Dockerfil docker image. I think it's possible this alone could account for the timeout changes...