alda-lang / alda

A music programming language for musicians. :notes:
https://alda.io
Eclipse Public License 2.0
5.58k stars 282 forks source link

client/parser/scanner.go: invalid rune quoting #492

Closed 3052 closed 8 months ago

3052 commented 8 months ago

runes should be quoted with %q, not '%c'

https://github.com/alda-lang/alda/blob/8608a982bdc5522ed9b6f697ad294bff61323903/client/parser/scanner.go#L34-L51

%q  a single-quoted character literal safely escaped with Go syntax.

https://godocs.io/fmt#hdr-Printing

daveyarwood commented 8 months ago

Hi, thanks for the discussion!

I don't fully understand the issue. I played around with this a bit in Go Playground, and as far as I can tell, the code we have now where we're printing '%c' (with added single quotes) produces identical results to what we would get if we switched to %q (without quotes). For example:

package main

import "fmt"

func main() {
    fmt.Printf("%%c is %c, %%q is %q\n", '🐧', '🐧')
}

Produces the following output:

%c is 🐧, %q is '🐧'

Is there an edge case that I'm missing?

3052 commented 8 months ago

yeah the single quote itself, '

daveyarwood commented 8 months ago

I'm inclined to leave this the way it is. The explicit single quotes in the Alda error message are unrelated to Go's choice to put single quotes around a rune when you print it via %q. Hypothetically, if we wanted to change the single quotes in the error message in the future, the way we have it now would make it easier to understand how to do that.

3052 commented 8 months ago

suit yourself, but in some cases you are going to end up with ''', which is obviously invalid

daveyarwood commented 8 months ago

That's a good example to consider!

I tried that in Go Playground just now, and I got this output:

%c is ', %q is '\''

Given the choice between presenting end users with ''' or '\'', I think I would actually prefer '''. Yes, it's weird-looking, but I think the backslash could be potentially more confusing.

daveyarwood commented 8 months ago

Another option worth considering here is to remove the single-quotes entirely and just display %c. For example, in the case of a single quote, an error message might look like:

Unexpected ' at the top level

whereas currently, it would be:

Unexpected ''' at the top level
3052 commented 8 months ago

%c is bad, because any control, white space, or otherwise non printing characters are not going to produce helpful output.

daveyarwood commented 8 months ago

Agreed, although we do have explicit checks for those kinds of characters before we fall back to the %c representation.

I appreciate the suggestion to use %q instead. I don't see an immediate need to switch, but I will keep it in mind! Thanks again for the discussion.

3052 commented 8 months ago

as mentioned, the current checks are not robust, so other characters are going to fall through:

package main

import "fmt"

func main() {
   fmt.Printf("'%c'\n", 0x200E) 
   fmt.Printf("%q\n", 0x200E) 
}

result:

'‎'
'\u200e'

I would again argue that %q is more helpful because of this reason, and because as mentioned, '%c' produces invalid Go code if the input is '.