rems-project / sail

Sail architecture definition language
Other
613 stars 109 forks source link

String pattern matching #286

Open gleachkr opened 1 year ago

gleachkr commented 1 year ago

Hi! The manual suggests that pattern matching on strings ought to be possible. I think sail has probably moved on from the syntax given there, but I can't seem to figure out whether this is still supported. Here's a minimal example:

default Order dec
$include <prelude.sail>

val hello : string -> nat
function hello(s) = match s {
    "hello" ^ s => 0,
    _ => 1
}

using sail -i, this throws "Function hello calls function marked non-executable".

Alasdair commented 1 year ago

Sorry for taking a while to reply!

We don't support some forms of string-related logic, but for backwards compatibility code that uses those features is marked as non-executable, and the compiler statically checks it is unreachable so it can be discarded.

In general, we want to guarantee that no logic in a specification relies on strings in any way, because as soon as control-flow logic (like an if statement or match) relies on comparing strings we can't generate sensible systemverilog or SMT theories.

nwf-msr commented 1 year ago

I am not sure how you'll feel about this "workaround", @Alasdair, but I have found that I want to do some string parsing not so much in the model but in the emulator's top-level, in __SetConfig, and so I have found that I can match against literal strings and, when I need to pull one apart, appeal to stacked partial patterns:

$include <prelude.sail>
$include <mapping.sail>

mapping mdig : string <-> int = { "0" <-> 0, "1" <-> 1 }
mapping vdig : string <-> int = { "v" ^ mdig(x) <-> x }

val main : unit -> unit
function main() = {
  match "v0" {
    "baz" => print_endline("exact string literal"),
    mdig(v) => print_int("can directly scan for digits: ", v),
    vdig(v) => print_int("can't ^ here, but can use a stacked pattern: ", v),
    _ => print_endline("and this works, too")
  }
}
nwf-msr commented 10 months ago

Hrm. Somewhat unsurprisingly, that workaround no longer works as of 0.17.1. It probably shouldn't have even when I posted it. :)

I would like to have just enough string parsing to parse my command line arguments and tear apart instruction mnemonics for a very primitive assembler. I think I could get away with something where there's no need for backtracking search because every string-destructing pattern disjunct has prefix-free initial literals on the case arms. Do you think something like that could be implemented for the C and OCaml backends, even if not the theorem provers'? (Again, my string use cases come down to "frontend" and, at least in my model, are quite readily severable from the "computational" aspects one might want to use in proving.)

Alasdair commented 10 months ago

We could add functions like is_prefix, index_of, and so on that are simpler and have well-defined semantics.

I have some ideas how to get the string append patterns 'working', although I think I can realistically only support them in our emulator-ish targets (C/OCaml/Interpreter) so I'll still need to quarantine them somehow.

nwf-msr commented 10 months ago

Yeah, quarantining them to emulator-ish targets seems fine.

I realized that "prefix-free choice" is probably a little too strict; at least, the upstream assembler for the internal architecture I'm modeling has optional {in,suf}fixes in its mnemonics, so at least also accepting the empty string in matches would be nice. (Though really, if that's the only thing that would be needed to make sail happy, I'm happy to push to change the assembler mnemonics.)