BNFC / bnfc

BNF Converter
http://bnfc.digitalgrammars.com/
Java serialization (pretty print) of strings does not escape special chars #449

Closed marco-comini closed 11 months ago

marco-comini commented 1 year ago

In the Haskell backend, the printer for the predefined basic type String escapes special characters in strings (e.g. "line1\nline2"). In the Java backend it does not, and the output is "line1 line2".

andreasabel commented 11 months ago

This issue exists in all non-FP backends.

The Haskell backend produces the following printer for strings:

printString :: String -> Doc
printString s = doc (showChar '"' . concatS (map (mkEsc '"') s) . showChar '"')

mkEsc :: Char -> Char -> ShowS
mkEsc q = \case
  s | s == q -> showChar '\\' . showChar s
  '\\' -> showString "\\\\"
  '\n' -> showString "\\n"
  '\t' -> showString "\\t"
  s -> showChar s

The OCaml backend produces:

let prtString (_:int) (s:string) : doc = render ("\"" ^ String.escaped s ^ "\"")

From the docs: https://v2.ocaml.org/api/String.html

escaped s is s with special characters represented by escape sequences, following the lexical conventions of OCaml.

All characters outside the US-ASCII printable range [0x20;0x7E] are escaped, as well as backslash (0x5C) and double-quote (0x22).

The Java backend produces only this:

  private static void printQuoted(String s) { render("\"" + s + "\""); }
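
For comparison, here is a minimal sketch of what an escaping version could look like in Java, mirroring the cases handled by the Haskell `mkEsc` above (the quote character, backslash, newline, and tab). The class `QuotedPrinter` and the helpers `escapeChar` and `quote` are hypothetical names chosen for illustration; they are not part of BNFC's generated code, and the sketch returns the quoted string rather than calling the generated `render`.

```java
// Hypothetical stand-alone sketch; not BNFC's generated printer.
final class QuotedPrinter {

    // Escape a single character, following the same cases as the
    // Haskell backend's mkEsc: the quote character, backslash,
    // newline, and tab. Everything else is passed through unchanged.
    static String escapeChar(char quote, char c) {
        if (c == quote) {
            return "\\" + c;
        }
        switch (c) {
            case '\\': return "\\\\";
            case '\n': return "\\n";
            case '\t': return "\\t";
            default:   return String.valueOf(c);
        }
    }

    // Wrap s in double quotes, escaping each character on the way.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            sb.append(escapeChar('"', s.charAt(i)));
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The newline is printed as the two characters '\' and 'n'.
        System.out.println(quote("line1\nline2"));
    }
}
```

In the generated printer, the body of `printQuoted` would presumably pass the result of such a function to `render` instead of concatenating `s` directly. That wiring is an assumption and is not shown here.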

The C++ backend produces this:

void PrintAbsyn::visitString(String s)
{
  bufAppend('\"');
  bufAppend(s);
  bufAppend('\"');
  bufAppend(' ');
}

andreasabel commented 11 months ago

While working on this issue, I found that two backends do not lex escape characters properly:

marco-comini commented 11 months ago

thank you