HarvardPL / formulog

Datalog with support for SMT queries and first-order functional programming
https://harvardpl.github.io/formulog/
Apache License 2.0
155 stars 10 forks source link

Type-check error trying to cons a string to a string list, I'm probably cons-ing wrong #77

Closed bubaflub closed 10 months ago

bubaflub commented 10 months ago

As part of trying to parse a string IPv4 address into its binary representation, I've written a string split utility function that takes an input string haystack and the character to split on needle and returns a list of strings. I started with:

fun stringSplit(haystack: string, needle: string) : string list =
  let haystackChars = string_to_list(haystack) in
  let needleChar :: _ = string_to_list(needle) in
  stringSplit_(haystackChars, needleChar, [], [])

And originally I wrote the helper function stringSplit_:

(* this one doesn't type check *)
fun stringSplit_(haystackChars: i32 list, needleChar: i32, currentGroup: i32 list, sofar: string list) : string list =
  match haystackChars with
  | [] => sofar
  | current :: nil =>
    let lastGroup = cons(currentGroup, current) in
    let lastString = list_to_string(lastGroup) in
    stringSplit_([], needleChar, currentGroup, cons(sofar, lastString))
  | current :: rest =>
    if current = needleChar
    then
      let part = list_to_string(currentGroup) in
      stringSplit_(rest, needleChar, [], cons(sofar, part))
    else
      stringSplit_(rest, needleChar, cons(currentGroup, current), sofar)
  end

But I'm getting a type-check error:

Exception: Problem type checking function: stringSplit_
Type error in function: stringSplit_
match haystackChars with
    | [] => sofar
    | [current] => match (currentGroup :: current) with
    | lastGroup => match list_to_string(lastGroup) with
    | lastString => stringSplit_([], needleChar, currentGroup, (sofar :: lastString))
end
end
    | (current :: rest) => match beq(current, needleChar) with
    | true => match list_to_string(currentGroup) with
    | part => stringSplit_(rest, needleChar, [], (sofar :: part))
end
    | false => stringSplit_(rest, needleChar, (currentGroup :: current), sofar)
end
end
Cannot unify string and string list list
Problematic term: lastString

Which is surprising to me. If I reverse the order of arguments in my calls to cons I type check just fine but the lists are in the reverse order. stringSplit("11.2.3.4", ".") outputs ["4", "3", "2", "11"]. I can use the rev function from one of the included examples:

fun rev(Xs: 'a list) : 'a list =
  let fun cons_wrapper(Xs: 'a list, X: 'a) : 'a list = X :: Xs in
  fold[cons_wrapper]([], Xs)

And then reverse the lists where appropriate:

(* this one works *)
fun stringSplit_(haystackChars: i32 list, needleChar: i32, currentGroup: i32 list, sofar: string list) : string list =
  match haystackChars with
  | [] => rev(sofar)
  | current :: nil =>
    let lastGroup = cons(current, currentGroup) in
    let lastString = list_to_string(lastGroup) in
    stringSplit_([], needleChar, currentGroup, cons(lastString, sofar))
  | current :: rest =>
    if current = needleChar
    then
      let part = list_to_string(rev(currentGroup)) in
      stringSplit_(rest, needleChar, [], cons(part, sofar))
    else
      stringSplit_(rest, needleChar, cons(current, currentGroup), sofar)
  end

My questions:

  1. My mental model for cons was that it was adding an item to the end of the list. I'm guessing it actually pre-pends the item. Is that correct?
  2. Is there a way to construct this list so I don't have to reverse it later? I suppose it's not much of an actual concern but it feels like its a bit messy and maybe I'm doing some unnecessary work.
  3. I feel like I've got some repeated code in the various pattern matching blocks in stringSplit_. This pattern matching took a while to figure out and I'm not thrilled with my solution of having both a current :: rest block and a current :: nil block. Is there a better (more concise, more idiomatic) way to do this pattern matching block?
  4. Stepping back a bit, it would be nice to import or use a standard library where this function is already implemented instead of writing my own. What flavor of ML does formulog support? And can I import or otherwise re-use some existing ML standard library code somehow?
mgree commented 10 months ago
  1. cons sticks things on the front of a list. You can imagine a definition like: 'a list = nil | cons 'a ('a list)
  2. You might be looking for something like "accumulator passing style".
  3. It's not uncommon to special case x :: nil in a pattern match---that's just special handling for the singleton list.
  4. Unfortunately, Formulog's ML dialect is kind of its own. SML or OCaml programs should work with minor adaptations, but it's not as simple as copy/pasting or importing definitions. Sorry...
bubaflub commented 10 months ago

You're absolutely right. I checked some other ML/OCaml examples and I was just wrong about how cons works. I think I'll keep the code as-is for now and revisit it if it becomes too unwieldy or too slow.

Thanks!