LightAndLight / ipso

A functional scripting language.
https://ipso.dev

Splitting the empty string should return the empty array #272

Closed by LightAndLight 1 year ago

LightAndLight commented 1 year ago

Motivating usage (found during https://github.com/LightAndLight/ipso/issues/271):

cmd.run `command ${string.splitc ' ' words}`

If words == "" then this will call command with a single argument - the empty string - because string.splitc ' ' "" == [""].

It also increases the symmetry with string.join:

string.split sep "" == []
string.join sep [] == ""

LightAndLight commented 1 year ago

The symmetry is already preserved when we have string.split sep "" == [""], because string.join sep [""] == "".

While trying to "fix" this, I realised that Rust's way of doing it (which we reuse in the interpreter) is already correct. The extra empty strings that string.split generates are important for string.join to be the inverse of string.split.
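
To make the inverse property concrete, here is a minimal Rust sketch built on the standard library's str::split and slice join. The wrapper names splitc and join are only illustrative stand-ins for the ipso functions, not the interpreter's actual code.

```rust
/// Illustrative stand-in for string.splitc: split on one character and
/// keep every empty piece, exactly as Rust's `str::split` does.
fn splitc(sep: char, value: &str) -> Vec<String> {
    value.split(sep).map(String::from).collect()
}

/// Illustrative stand-in for string.join with a single-character separator.
fn join(sep: char, parts: &[String]) -> String {
    let sep = sep.to_string();
    parts.join(sep.as_str())
}

fn main() {
    // Splitting the empty string yields one empty piece, not zero pieces.
    assert_eq!(splitc(' ', ""), vec![""]);

    // Leading, trailing and doubled separators all leave empty pieces behind.
    assert_eq!(splitc(' ', " a  b "), vec!["", "a", "", "b", ""]);

    // Because those empty pieces are kept, joining undoes splitting.
    for input in ["", "a", "a b", " a  b "] {
        assert_eq!(join(' ', &splitc(' ', input)), input);
    }

    // Joining an empty array also gives "", so "" has two preimages under
    // join; split has to pick one of them, and it picks [""].
    assert_eq!(join(' ', &[]), "");
}
```

If splitc returned [] for the empty string, the round trip would still hold for "", but dropping empty pieces in general would break it for inputs such as " a  b ".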

I still think that string.splitc sep "" == [""] is an easy way to make a mistake - the motivating usage still stands. I can't think of anything better right now.

I considered

splitc : Char -> String -> { prefixes : Array String, suffix : String }

and

splitc : Char -> String -> (| Match : Array String, NoMatch : String |)

but I think it'd be too inconvenient.

For the original use in https://github.com/LightAndLight/ipso/issues/271, I suggest the following:

bind paths <-
  let paths = string.splitc ' ' <| string.stripc ' ' outPaths
  if paths == [""]
    then
      comp
        println "OUT_PATHS contains no paths"
        exit.failure
    else io.pure paths
cmd.run `/nix/var/nix/profiles/default/bin/nix copy --to $binaryCacheUrl $paths`

LightAndLight commented 1 year ago

The function \x -> string.splitc ' ' <| string.stripc ' ' x gestures at a potential set of functions:

partsc (delimiter : Char) (value : String) : Array String =
  let stripped = stripc delimiter value in
  if stripped == ""
  then []
  else splitp (\c -> c == delimiter) stripped

parts (delimiter : String) (value : String) : Array String =
   ...

partsp (delimiter : Char -> Bool) (value : String) : Array String =
   ...

The split family of functions focuses on the "negative space" of the input value (the delimiter) - it removes the delimiter and nothing else, which is why string.join is its inverse. The parts family focuses on the "positive space" of the input value - the pieces that are left after removing the delimiter. parts is like a more general version of Haskell's words.
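
As a rough illustration of the difference, the partsc sketch above can be transcribed into Rust. This assumes that stripc removes the delimiter from both ends of the string (modelled here with trim_matches). The Rust function name partsc is hypothetical and simply mirrors the proposal.

```rust
/// Hypothetical Rust transcription of the proposed partsc: strip the
/// delimiter from both ends, return no parts if nothing is left,
/// otherwise split what remains on the delimiter.
fn partsc(delimiter: char, value: &str) -> Vec<String> {
    // Assumption: ipso's stripc trims the character from both ends.
    let stripped = value.trim_matches(delimiter);
    if stripped.is_empty() {
        Vec::new()
    } else {
        stripped.split(delimiter).map(String::from).collect()
    }
}

fn main() {
    // No content means no parts, so a command receives zero arguments.
    assert!(partsc(' ', "").is_empty());
    assert!(partsc(' ', "   ").is_empty());

    // Surrounding delimiters are dropped instead of becoming empty parts.
    assert_eq!(partsc(' ', " a b "), vec!["a", "b"]);

    // Read literally, the sketch still keeps empty parts for doubled
    // delimiters in the middle of the string.
    assert_eq!(partsc(' ', "a  b"), vec!["a", "", "b"]);
}
```

Under this reading, a run of delimiters in the middle still produces empty parts, whereas Haskell's words collapses runs of whitespace. Whether the parts family should collapse such runs as well is left open by the sketch.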

The original example could be written using partsc without any error checking:

cmd.run `command ${string.partsc ' ' outPaths}`

LightAndLight commented 1 year ago

Closing in favour of https://github.com/LightAndLight/ipso/issues/280