My add: --use-basename-for-filename should not be just a developer option -- I would find it essential for testing as a user. Also --error-limit and --random-seed.
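As an illustration of why this matters for testing (the file path below is a placeholder, and the behavior is my understanding of the option rather than a quote from its documentation): with --use-basename-for-filename, diagnostics refer to a file by its basename rather than its full path, so expected test output doesn't depend on where the repository is checked out.

    # Hypothetical path; diagnostics would mention "Example.dfy" instead of the full path.
    dafny verify --use-basename-for-filename /home/ci/build/src/Example.dfy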
/vcsLoad is already covered by --cores.
I would suggest making --solver-log a hidden option.
/trace is poorly defined. The documentation reads "blurt out various debug trace information". We should think about what it's used for and how to expose that with a good UI. It also seems quite similar to --log-format.
What does /proverDll add to the already existing --solver-path?
What's the use-case of error-limit?
How do you feel about letting customers use measure-complexity for the options you identified that aren't in verify -- --log-format, --log-file, --isolate-assertions -- so measure-complexity would be for performance debugging?
It appears to me that there is a whole set of options whose point is to print out various kinds of debug information (/trace among them). They could be combined into a hidden option --debug.
Oh, does --cores allow a fraction to use that fraction of the available cores? I didn't realize that.
It's true that /trace is poorly-defined, but it's very useful as a quick mechanism to see how things are progressing. I wouldn't typically use it in an automated build, or write anything automated that depended on its contents, but I very frequently use it when running things during development. I use it so often that I'd probably never use dafny verify if it wasn't available. I also tend to use it in addition to --log-format. But I am happy to have it as a hidden option, and perhaps as an argument to --debug, as @davidcok suggests.
We could make it so that, if --solver-path points to a plugin DLL rather than an executable file, Dafny would load it as a plugin. I kind of like that approach. But we'd need to add code to make that work. I'm torn about --solver-log. I wouldn't strongly oppose making it hidden, but enough people would probably use it and want to learn about it from the help that I'd slightly prefer it not to be hidden (unless maybe we have a --show-hidden or something that gives you descriptions for all the options).
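A sketch of what that might look like on the command line, assuming the DLL-detection idea above were implemented (it isn't today, and MyCustomSolver.dll is a made-up name):

    # Hypothetical: Dafny would notice the path ends in .dll and load it as a solver plugin
    # instead of launching it as a solver executable.
    dafny verify --solver-path ./MyCustomSolver.dll Example.dfy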
For error-limit: the default error limit is 5, and sometimes you want to see all of the errors if there are more than that. At the very least, some of the tests depend on being able to report more than 5 errors per definition.
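For example (the new-style flag is the one proposed in this issue; the old-style spelling is my best recollection of the existing syntax, and Example.dfy is a placeholder):

    # Old-style CLI: raise the per-definition error limit above the default of 5.
    dafny /errorLimit:100 Example.dfy
    # Proposed new-style equivalent:
    dafny verify --error-limit 100 Example.dfy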
I think I'd prefer to have --log-format, --log-file, and --isolate-assertions in both verify and measure-complexity, though they are more strictly necessary in measure-complexity.
Aaron - are you using /trace to show progress? I've thought before that on large projects I'd want a --progress option to show progress through the files and methods as they are verified.
Yeah, that's generally what I'm using it for, along with frequently wanting to see time and resource use statistics as they go by. Ultimately, what we might want is a --progress option in Dafny that's similar to, but perhaps not identical to, /trace, with a more focused purpose. In the meantime, --debug trace, or something like that, which enables exactly what /trace does, would be nice.
--cores=50% will use half of the available cores.
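For instance (Example.dfy is a placeholder; the percentage form is what's described above, and the plain count is the usual form as I understand it):

    dafny verify --cores=50% Example.dfy   # use half of the available cores
    dafny verify --cores=4 Example.dfy     # use exactly four cores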
I don't understand /proverDll. If you want to supply some custom C# code to use as a solver, then wouldn't you use --plugin=<assemblyPath>? Why would you also need to use /proverDll?
Maybe /trace could be --verbosity=debug?
Could you elaborate on the use-case of --log-format? What information does that give when you're doing dafny verify that you wouldn't get from /trace?
For error-limit: the default error limit is 5, and sometimes you want to see all of the errors if there are more than that. At the very least, some of the tests depend on being able to report more than 5 errors per definition.
👍
Boogie has a separate plugin mechanism, different from the Dafny plugin mechanism, which it specifically uses for solver plugins. It would be possible, however, for Dafny to introspect a plugin to figure out what kind it is, which I think could be a better user experience. We could support both with --plugin.
The /verificationLogger:csv (--log-format csv) option produces (would produce) a machine-readable CSV file containing all the statistics about verification time and resource use. It's what dafny-reportgenerator uses as input. That information is mostly available in /trace, but in a hard-to-parse (and sometimes impossible-to-parse) form.
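A typical invocation might look like this (the old-style spelling exists today; the new-style spelling uses the flag names proposed in this issue, and the file names are placeholders):

    # Old-style CLI:
    dafny /verificationLogger:csv Example.dfy
    # Proposed new-style equivalent, writing the statistics to results.csv:
    dafny verify --log-format csv --log-file results.csv Example.dfy

The resulting CSV can then be handed to dafny-reportgenerator or loaded into a spreadsheet.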
The /verificationLogger:text (--log-format text) option provides (would provide) both the outcome of each proof and the list of all assertions contained within each proof. It's particularly useful along with /vcsSplitOnEveryAssert (--isolate-assertions).
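Concretely, that combination would be something like (again, the new-style flags are the proposed spellings and Example.dfy is a placeholder):

    # Old-style CLI:
    dafny /verificationLogger:text /vcsSplitOnEveryAssert Example.dfy
    # Proposed new-style equivalent:
    dafny verify --log-format text --isolate-assertions Example.dfy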
I like the idea of using --verbosity for /trace. I was just talking with @davidcok about adding modes to make Boogie quiet (only errors and warnings, no "Dafny program verifier finished with ..." message) or silent (no output at all, except for a return code), and having these be --verbosity arguments in Dafny, too, seems like a good idea. I think I might call the three modes trace, quiet, and silent, reserving debug for a (future) mode with additional debug output, although I'm not strongly attached to that.
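Spelled out, the proposed modes would be used roughly like this (none of these exist yet; the spellings are purely illustrative):

    dafny verify --verbosity silent Example.dfy   # no output at all, return code only
    dafny verify --verbosity quiet Example.dfy    # errors and warnings only
    dafny verify --verbosity trace Example.dfy    # progress output, similar to /trace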
Boogie has a separate plugin mechanism, different from the Dafny plugin mechanism, which it specifically uses for solver plugins. It would be possible, however, for Dafny to introspect a plugin to figure out what kind it is, which I think could be a better user experience. We could support both with --plugin.
Ah, I was thinking Dafny users could use a Dafny plugin to configure a different solver back-end, but I guess you're saying Dafny users should be able to use Boogie plugins as well, so they don't need two separate .dlls. While this sounds like a nice UX, it also sounds a little complicated to me. If a Boogie plugin was built against a particular Boogie version, how do you know whether it works with a particular Dafny version?
That said, I'm OK with having a different option, like --boogie-plugin, for supporting plugins with a Boogie plugin format.
The /verificationLogger:csv (--log-format csv) produces (would produce) a machine-readable CSV file containing all the statistics about verification time and resource used. It's what dafny-reportgenerator uses as input. That information is mostly available in /trace, but in a hard-to parse (and sometimes impossible to parse) form.
I'm skeptical about providing an 'API' through the CLI that is not meant for user consumption, because I feel it clutters the CLI and there are good alternatives. If customers want their software to integrate with Dafny, they can consume one of Dafny's .NET packages and integrate with a C# API. Alternatively, the language server has verification-related APIs as well. We could consider providing an api command that provides one-shot uses of the language server API.
One thing is that measure-complexity is still marked as experimental, while verify is not. How about we first evolve the API before we decide what to put into verify? For example, I think we're still planning on moving the functionality of dafny-reportgenerator into measure-complexity. Would that affect the need for the CSV format?
Regarding the CSV format, things are grouped by assertion batches, but it's not clear which assertions are in each batch, so I don't understand what a user can do to evaluate these batches. To me it seems like assertion batches can only be defined by the assertions in them. If you don't provide that definition, it seems better to group by named verifiable (method/function/constant/datatype), so the user doesn't have to aggregate the information for each batch in a verifiable.
The /verificationLogger:text (--log-format text) option provides (would provide) both the outcome of each proof and the list of all assertions contained within each proof. It's particularly useful along with /vcsSplitOnEveryAssert (--isolate-assertions).
Regarding the text format, one of the tests (https://github.com/dafny-lang/dafny/blob/master/Test/logger/TextLogger.dfy#L14) produces an output where many of the verification batches have no assertion in them. What does that mean? There are also many repeated occurrences of the same batch, which is because multiple iterations have been specified, but it would be nice if the iteration number is shown so it's clear why there are multiple occurrences.
The /verificationLogger:csv (--log-format csv) produces (would produce) a machine-readable CSV file containing all the statistics about verification time and resource used. It's what dafny-reportgenerator uses as input. That information is mostly available in /trace, but in a hard-to parse (and sometimes impossible to parse) form.
I'm skeptical about providing an 'API' through the CLI that is not meant for user consumption, because I feel it clutters the CLI and there are good alternatives. If customers want their software to integrate with Dafny, they can consume one of Dafny's .NET packages and integrate with a C# API. Alternatively, the language server has verification-related APIs as well. We could consider providing an api command that provides one-shot uses of the language server API.
Textual formats are a lot more portable than programmatic APIs. Using Dafny as a library is only really feasible from another .NET application. CSV files are supported by a nearly endless range of tools. I sometimes load the output of this analysis into a spreadsheet, for example, and it can be used with tools like csvkit.
One thing is that measure-complexity is still marked as experimental, while verify is not. How about we first evolve the API before we decide what to put into verify? For example, I think we're still planning on moving the functionality of dafny-reportgenerator into measure-complexity. Would that affect the need for the CSV format?
I don't think it would remove the need for CSV output. The functionality we provide, whether in an external executable or an internal command, will always be somewhat limited. If anyone wants to query the data in a way that we haven't implemented, it's nice for that to be an easy process.
More significantly, though, we're actively using all of this functionality in practice. If we don't make these options available in the new CLI, I don't think even a single one of my regular uses of Dafny could be migrated away from using the old-style options.
Regarding the CSV format, things are grouped by assertion batches, but it's not clear which assertions are in each batch, so I don't understand what a user can do to evaluate these batches. To me it seems like assertion batches can only be defined by the assertions in them. If you don't provide that definition, it seems better to group by named verifiable (method/function/constant/datatype), so the user doesn't have to aggregate the information for each batch in a verifiable.
This format exists primarily for performance analysis, and therefore I think it makes sense to group things according to the unit for which performance data is available, which is at the level of SMT queries.
The /verificationLogger:text (--log-format text) option provides (would provide) both the outcome of each proof and the list of all assertions contained within each proof. It's particularly useful along with /vcsSplitOnEveryAssert (--isolate-assertions).
Regarding the text format, one of the tests (https://github.com/dafny-lang/dafny/blob/master/Test/logger/TextLogger.dfy#L14) produces an output where many of the verification batches have no assertion in them. What does that mean? There are also many repeated occurrences of the same batch, which is because multiple iterations have been specified, but it would be nice if the iteration number is shown so it's clear why there are multiple occurrences.
Regarding the first issue, the implementation of splitting that's currently in place frequently (or maybe always?) generates a split that has no assertions in it, and therefore that shows up in the verification results. I think we should fix that, for a number of reasons, but it seems to me that it's somewhat separate from the question of adding these command-line options.
Thanks for pointing out the lack of iteration number in the text output. That's something we should fix.
Textual formats are a lot more portable than programmatic APIs. Using Dafny as a library is only really feasible from another .NET application.
Yes, but I do wonder whether it would be better to expose output that is optimised for machine but not human consumption, like JSON for example, or the assertion-batch-grouped CSV, through a separate api command, and whether to give that the same structure as the language server API to prevent duplication, but thinking more about it I'm not so sure. I'll drop this point : )
This [CSV] format exists primarily for performance analysis, and therefore I think it makes sense to group things according to the unit for which performance data is available, which is at the level of SMT queries.
That sounds sensible, but what is a user supposed to do with the batch grouping, not knowing which assertions belong to a particular batch? One thing they can do is aggregate the batches per named thing (symbol), and then try to improve the performance of one of those symbols. Is there another thing they can do?
I guess the report generator determines a standard deviation per assertion batch, so then the grouping per batch makes sense.
However, you could also expose the lowest level information: have a row for each assertion, and a column that mentions the batch number. Assertions belonging to the same batch will be in consecutive rows that contain the same data except for the assertion's source location and possibly a code snippet.
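To make that suggestion concrete, a per-assertion layout could look roughly like the sketch below. The column names, locations, and numbers are made up for illustration; this is not an existing schema. Rows in the same batch repeat the batch-level data, differing only in the assertion's location and description:

    batch,assertion.location,assertion.description,outcome,resource.count
    1,"Example.dfy(12,9)",divisor is always non-zero,valid,41023
    1,"Example.dfy(14,3)",assertion always holds,valid,41023
    2,"Example.dfy(20,5)",postcondition holds on return,valid,18755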
To me it seems (too) difficult to design an output format that will be used by downstream tools, without knowing what those tools will be. Currently we have the report-generator as a downstream tool but we're planning to replace that (right?).
I think we should fix that, for a number of reasons, but it seems to me that it's somewhat separate from the question of adding these command-line options.
Sounds good. Just wanted to point it out since I looked at the output of the options. Another thing I would consider is to add a 'code snippet' for each assertion, in addition to the description that's already there, like x / 3 in addition to 'divisor is always non-zero'.
Also, it seems like users, when reading this output, would have to switch back and forth between their editor and the output a lot. It might be interesting to define a problem matcher for the Dafny IDE that works in conjunction with measure-complexity, so users can measure the complexity of their proofs while in the IDE, and see the output of that measurement inline in their code.
More significantly, though, we're actively using all of this functionality in practice. If we don't make these options available in the new CLI, I don't think even a single one of my regular uses of Dafny could be migrated away from using the old-style options.
These options are already available in measure-complexity, right? When using them to get data for the report-generator, using measure-complexity seems best anyways.
With #5150, #5185, #5186, I think we'll have most of what we need. I'm going to close this issue now that we have more fine-grained tracking in those issues.
Summary
Currently, the new top-level commands for Dafny provide most but not all of the frequently-used features of the tool. This issue is to track remaining options to fill in before 4.0.
Background and Motivation
Particularly when working on proof debugging and solver performance investigation, several of the most useful options are missing.
Proposed Feature
- --isolate-assertions to verify (not just check-complexity)
- /verificationLogger: --log-format and --log-file options to verify
- /vcsLoad: --solver-load option to verify
- /errorLimit: --error-limit option to resolve [DRC: to all commands] -- this one needed to support testing with new CLI
- /rLimit: --resource-limit option to verify
- /trace: --verbosity with silent, quiet, and trace options
- /proverDll: --solver-plugin option to verify
- /proverLog: --solver-log option to verify
- /proverOpt: --solver-option option to verify
- --stdin to resolve (and everything else, transitively)

DRC: Anything for verify needs to be available on translate/build/run in case those verify.
DRC: Others to consider: --random-seed, --help-attr, --deprecation, something like /noCheating:0 to allow mix of assert and expect and assume, something to easily /extractCounterexample.
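To sketch how a user of the old-style flags above might migrate (the old-style spellings are my best recollection and may differ slightly; the new-style spellings are the proposals from this list, not options that all exist yet; Example.dfy is a placeholder):

    # Old-style invocation combining several of the options above:
    dafny /vcsSplitOnEveryAssert /verificationLogger:csv /errorLimit:20 Example.dfy
    # Roughly equivalent new-style invocation under this proposal:
    dafny verify --isolate-assertions --log-format csv --error-limit 20 Example.dfy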
I fear if we don't supply what users need to tinker with proof performance, they may just not use the new CLI. So maybe also --legacy="...string of old options in old format..." just in case.
Alternatives
No response