bakpakin / Fennel

Lua Lisp Language
https://fennel-lang.org
MIT License

Compiling with Fennel built via 5.1.5 incorrectly encodes strings in Lua #386

Closed rktjmp closed 2 years ago

rktjmp commented 3 years ago

If you build Fennel with Lua 5.1.5, strings in the Lua output are incorrectly encoded. This affects both fennel and fennel.lua.

Edit: updated to show the failure to encode escaped " characters, which is what made me notice. Without those, the code still runs; the output is just unreadable.

Using https://github.com/asdf-vm/asdf makes it easy to swap Lua versions.

test.fnl

(print "this is a \"test\" string")

Build with Lua 5.4.0

asdf shell lua 5.4.0
make clean
make
./fennel -c test.fnl

Output

return print("this is a \"test\" string")

./fennel -c test.fnl | lua
this is a "test" string

Build with Lua 5.1.5

asdf shell lua 5.1.5
make clean
make
./fennel -c test.fnl

Output

return print("\116\104\105\115 \105\115 \97 \92"\116\101\115\116\92" \115\116\114\105\110\103")

./fennel -c test.fnl | lua
.../lua/5.1.5/bin/lua: stdin:1: ')' expected near '\'

diff of executables

diff fennel-5.1.5 fennel-5.4.0
170c170
<     io.write(table.concat(xs, "\t"))
---
>     io.write(table.concat(xs, "\9"))
1795c1795
<         return f:read("*all"):gsub("[\r\n]*$", "")
---
>         return f:read("*all"):gsub("[\13\n]*$", "")
1931c1931
<   local serialize_subst = {["\a"] = "\\a", ["\b"] = "\\b", ["\f"] = "\\f", ["\n"] = "n", ["\t"] = "\\t", ["\v"] = "\\v"}
---
>   local serialize_subst = {["\11"] = "\\v", ["\12"] = "\\f", ["\7"] = "\\a", ["\8"] = "\\b", ["\9"] = "\\t", ["\n"] = "n"}
1936c1936
<     return string.gsub(string.gsub(string.format("%q", str), ".", serialize_subst), "[x80-xff]", _0_)
---
>     return string.gsub(string.gsub(string.format("%q", str), ".", serialize_subst), "[\128-\255]", _0_)
2960c2960
<       return read_line_from_string(string.gmatch((source .. "\n"), "(.-)(\r?\n)"), line)
---
>       return read_line_from_string(string.gmatch((source .. "\n"), "(.-)(\13?\n)"), line)
3289c3289
<         local formatted = raw:gsub("[\a-\r]", escape_char)
---
>         local formatted = raw:gsub("[\7-\13]", escape_char)
3766c3766
<     escs = setmetatable({["\""] = "\\\"", ["\\"] = "\\\\", ["\a"] = "\\a", ["\b"] = "\\b", ["\f"] = "\\f", ["\n"] = _2_, ["\r"] = "\\r", ["\t"] = "\\t", ["\v"] = "\\v"}, {__index = _4_})
---
>     escs = setmetatable({["\""] = "\\\"", ["\11"] = "\\v", ["\12"] = "\\f", ["\13"] = "\\r", ["\7"] = "\\a", ["\8"] = "\\b", ["\9"] = "\\t", ["\\"] = "\\\\", ["\n"] = _2_}, {__index = _4_})
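
One way to read the diff above: the pattern at line 1936 differs between the two builds. `[\128-\255]` matches only bytes 128 to 255, while `[x80-xff]` is a character class made of `x`, `8`, `f`, and the range `0` to `x`, which covers digits, uppercase letters, most lowercase letters, and the backslash. Lua 5.1 has no `\x` hex escape (it arrived in 5.2), so a `\x80`-style escape read under 5.1 would plausibly end up as the literal text `x80`; that cause is an assumption, not something confirmed in this thread. The sketch below applies both patterns to the test string. `decimal_escape` is a hypothetical stand-in for the `_0_` callback, assumed to emit a decimal escape for each matched byte, and the `serialize_subst` pass is left out because it does not touch this input.

```lua
-- Hypothetical stand-in for the escape callback (_0_ in the diff);
-- assumed to emit a decimal escape for the matched byte.
local function decimal_escape(c)
  return "\\" .. c:byte()
end

local broken = "[x80-xff]"      -- pattern text from the 5.1.5 build
local intended = "[\128-\255]"  -- pattern text from the 5.4.0 build

-- %q wraps the string in quotes and escapes the embedded double quotes.
local quoted = string.format("%q", 'this is a "test" string')

-- gsub returns two values; the assignment keeps only the new string.
local intended_out = quoted:gsub(intended, decimal_escape)
local broken_out = quoted:gsub(broken, decimal_escape)

print(intended_out)
print(broken_out)

-- loadstring exists in Lua 5.1; later versions accept a string in load.
local load_chunk = loadstring or load
local ok_chunk = load_chunk("return " .. intended_out)
local bad_chunk, err = load_chunk("return " .. broken_out)
print(ok_chunk ~= nil, bad_chunk, err)
```

Under these assumptions the first pattern leaves the quoted string unchanged, while the second reproduces the string from the 5.1.5 output: the backslash in front of each inner quote becomes \92, the quote itself is left bare, and the resulting chunk no longer parses.
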
rktjmp commented 3 years ago

This may not be an issue in the real world, since you can run the 5.4.0 built executable in a 5.1.5 runtime, but it might catch someone out.

technomancy commented 3 years ago

I can't reproduce this problem; can you give some details about the system on which you're seeing this?

rktjmp commented 3 years ago

Sorry, yes:

Arch Linux, Linux 5.13.10-arch1-1 #1 SMP PREEMPT Thu, 12 Aug 2021 21:59:14 +0000 x86_64 GNU/Linux

There was some talk of glibc versions; I have 2.33.

Both versions of Lua built clean before the test.

technomancy commented 3 years ago

Could you share your locale settings? We had a similar problem a while back that couldn't be reproduced without changing locale.

rktjmp commented 3 years ago

λ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES=
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=