roc-lang / roc

A fast, friendly, functional language.
https://roc-lang.org
Universal Permissive License v1.0

Fails to match on a Str in an if statement #6919

Closed: lukewilliamboswell closed this issue 3 months ago

lukewilliamboswell commented 4 months ago

The first CSV header is parsed as @DE (Custom "FOO") where I expect @DE FOO.

For some reason the Str equality check in the first if branch fails, so "FOO" falls through to the final else.

$ roc test Broken.roc
── EXPECT FAILED in Broken.roc ─────────────────────────────────────────────────

This expectation failed:

32│>  expect
33│>      input = "FOO,BAR,BAZ\nsome other content\n"
34│>      actual = fromCsvHeaders input
35│>      expected = Ok [@DE FOO, @DE BAR, @DE BAZ]
36│>      actual == expected

When it failed, these variables had these values:

input : Str
input = """
FOO,BAR,BAZ
some other content

"""

actual : Result (List DE) [InvalidCsvHeaders]
actual = Ok [@DE (Custom "FOO"), @DE BAR, @DE BAZ]

expected : [
    Err [InvalidCsvHeaders],
    Ok (List DE),
]
expected = Ok [@DE FOO, @DE BAR, @DE BAZ]

1 failed and 0 passed in 487 ms.

Broken.roc:

module []

DE := [
    Custom Str,
    FOO,
    BAR,
    BAZ,
]
    implements [Eq, Inspect]

fromStr : Str -> DE
fromStr = \raw ->
    if raw == "FOO" then @DE FOO
    else if raw == "BAR" then @DE BAR
    else if raw == "BAZ" then @DE BAZ
    else @DE (Custom raw)

fromCsvHeaders : Str -> Result (List DE) [InvalidCsvHeaders]
fromCsvHeaders = \input ->
    firstLine <-
        input
        |> Str.splitFirst "\n"
        |> Result.map .before
        |> Result.mapErr \_ -> InvalidCsvHeaders
        |> Result.try

    firstLine
    |> Str.split ","
    |> List.map fromStr
    |> Ok

expect
    input = "FOO,BAR,BAZ\nsome other content\n"
    actual = fromCsvHeaders input
    expected = Ok [@DE FOO, @DE BAR, @DE BAZ]
    actual == expected

lukewilliamboswell commented 4 months ago

The same thing happens if I use List U8 instead of Str:

actual = Ok [@DE (Custom [70, 79, 79]), @DE BAR, @DE BAZ]
expected = Ok [@DE FOO, @DE BAR, @DE BAZ]

module []

DE := [
    Custom (List U8),
    FOO,
    BAR,
    BAZ,
]
    implements [Eq, Inspect]

fromStr : List U8 -> DE
fromStr = \raw ->
    if raw == Str.toUtf8 "FOO" then @DE FOO
    else if raw == Str.toUtf8 "BAR" then @DE BAR
    else if raw == Str.toUtf8 "BAZ" then @DE BAZ
    else @DE (Custom raw)

fromCsvHeaders : Str -> Result (List DE) [InvalidCsvHeaders]
fromCsvHeaders = \input ->
    firstLine <-
        input
        |> Str.splitFirst "\n"
        |> Result.map .before
        |> Result.mapErr \_ -> InvalidCsvHeaders
        |> Result.try

    firstLine
    |> Str.split ","
    |> List.map Str.toUtf8
    |> List.map fromStr
    |> Ok

expect
    input = "FOO,BAR,BAZ\nsome other content\n"
    actual = fromCsvHeaders input
    expected = Ok [@DE FOO, @DE BAR, @DE BAZ]
    actual == expected

lukewilliamboswell commented 4 months ago

An even simpler reproduction:

module []

fromStr : Str -> _
fromStr = \raw ->
    if raw == "FOO" then FOO
    else if raw == "BAR" then BAR
    else if raw == "BAZ" then BAZ
    else OTHER

expect
    actual = ["FOO", "BAR", "BAZ"] |> List.map fromStr
    expected = [FOO, BAR, BAZ]
    actual == expected

basile-henry commented 3 months ago

You seem to have an invisible character at the start of the string "FOO" in your first if condition. It shows up as a space in my editor when I copy/paste your snippet. Here's an xxd view of it:

$ xxd test.roc
00000000: 6d6f 6475 6c65 205b 5d0a 0a66 726f 6d53  module []..fromS
00000010: 7472 203a 2053 7472 202d 3e20 5f0a 6672  tr : Str -> _.fr
00000020: 6f6d 5374 7220 3d20 5c72 6177 202d 3e0a  omStr = \raw ->.
00000030: 2020 2020 6966 2072 6177 203d 3d20 22ef      if raw == ".
00000040: bbbf 464f 4f22 2074 6865 6e0a 2020 2020  ..FOO" then.
00000050: 2020 2020 464f 4f0a 2020 2020 656c 7365      FOO.    else
00000060: 2069 6620 7261 7720 3d3d 2022 4241 5222   if raw == "BAR"
00000070: 2074 6865 6e0a 2020 2020 2020 2020 4241   then.        BA
00000080: 520a 2020 2020 656c 7365 2069 6620 7261  R.    else if ra
00000090: 7720 3d3d 2022 4241 5a22 2074 6865 6e0a  w == "BAZ" then.
000000a0: 2020 2020 2020 2020 4241 5a0a 2020 2020          BAZ.
000000b0: 656c 7365 0a20 2020 2020 2020 204f 5448  else.        OTH
000000c0: 4552 0a0a 6578 7065 6374 0a20 2020 2061  ER..expect.    a
000000d0: 6374 7561 6c20 3d20 5b22 464f 4f22 2c20  ctual = ["FOO",
000000e0: 2242 4152 222c 2022 4241 5a22 5d20 7c3e  "BAR", "BAZ"] |>
000000f0: 204c 6973 742e 6d61 7020 6672 6f6d 5374   List.map fromSt
00000100: 720a 2020 2020 6578 7065 6374 6564 203d  r.    expected =
00000110: 205b 464f 4f2c 2042 4152 2c20 4241 5a5d   [FOO, BAR, BAZ]
00000120: 0a20 2020 2061 6374 7561 6c20 3d3d 2065  .    actual == e
00000130: 7870 6563 7465 640a                      xpected.

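I think ef bb bf is the unexpected part of the string. Those three bytes are the UTF-8 encoding of U+FEFF, the byte order mark (BOM), a zero-width character, so the literal is not byte-for-byte equal to "FOO" and the comparison fails.

Admittedly this probably shouldn't be allowed in string literals, and it would have been easier to catch if the compiler complained about it.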
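
Until the compiler warns about this, a small script can flag the stray bytes. Below is a minimal Python sketch (not part of the Roc toolchain); the function name find_bom_offsets is hypothetical, and the sample line is just the first condition from the snippet with the three bytes written out explicitly.

```python
# UTF-8 encoding of U+FEFF (byte order mark / zero width no-break space).
BOM = b"\xef\xbb\xbf"


def find_bom_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every UTF-8 BOM sequence in data."""
    offsets = []
    start = data.find(BOM)
    while start != -1:
        offsets.append(start)
        start = data.find(BOM, start + len(BOM))
    return offsets


# The first condition from the snippet, with the hidden bytes made explicit.
line = b'    if raw == "\xef\xbb\xbfFOO" then\n'
print(find_bom_offsets(line))  # prints [15]

# The two strings look the same on screen but are not equal.
print("\ufeffFOO" == "FOO")  # prints False
```

To scan a whole file, pass its raw bytes, for example find_bom_offsets(open("test.roc", "rb").read()). For the file in the dump above, the match is at offset 63 (0x3f).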

lukewilliamboswell commented 3 months ago

Wow, thank you @basile-henry for looking into this. That makes so much sense. I "discovered" the issue while working with CSV files and copying bytes to and from the roc REPL.

I definitely think it's worth discussing how we could save someone else from the same fate. I'll raise an ideas thread.

lukewilliamboswell commented 3 months ago

Closing this as the original issue is resolved. We can tackle the potential improvements under a different issue or PR.