blinry / legit

An esoteric programming language where programs are defined by the graph of commits in a Git repository.
https://morr.cc/legit/

Incorrect parsing of string literals containing escaped quote characters #4

Closed remuladgryta closed 5 years ago

remuladgryta commented 5 years ago

It appears that String#myshellsplit is not correctly tokenizing string literals containing quote characters.

Parsing of string literals breaks whenever the literal contains an escaped quote character followed by a space, but works as expected if there is an even number of escaped quotes before any space.

Steps to reproduce:

git init
git commit --allow-empty -m "quit"
git checkout -b print
git commit --allow-empty -m "put [PRINT]"
git checkout master
git merge --no-ff -m "dup" print
git tag PRINT
git commit --allow-empty -m "\"\\\"rab\\\" dias oof\""

Running interpreter.rb on the repository outputs foo said "bar" as expected.
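As a rough sketch of why that output is expected: the shell turns the -m argument into the commit message "\"rab\" dias oof" (quotes included), and the interpreter appears to undump that literal, as the undump frame in the tracebacks below suggests. That the characters then come out in reverse order is an assumption inferred from the expected output, not something taken from the interpreter source.

```ruby
# Commit message as stored by Git, after the shell has processed the -m argument.
message = '"\"rab\" dias oof"'

# String#undump (Ruby 2.5+) turns the dumped literal back into its contents.
literal = message.undump

# Assumption: the program prints the characters in reverse order.
puts literal.reverse
```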

git commit --allow-empty --amend -m "\"\\\" rab\\\" dias oof\""

(Note the space before rab.) Running interpreter.rb on the repository now crashes with:

Traceback (most recent call last):
    7: from interpreter.rb:163:in `<main>'
    6: from interpreter.rb:74:in `run'
    5: from interpreter.rb:74:in `loop'
    4: from interpreter.rb:77:in `block in run'
    3: from interpreter.rb:77:in `each'
    2: from interpreter.rb:78:in `block (2 levels) in run'
    1: from interpreter.rb:153:in `execute'
interpreter.rb:153:in `undump': unterminated dumped string (RuntimeError)

git commit --allow-empty --amend -m "\"\\\"rab \\\" dias oof\""

(Note the space after rab.) Running interpreter.rb on the repository now crashes with:

Traceback (most recent call last):
    6: from interpreter.rb:163:in `<main>'
    5: from interpreter.rb:74:in `run'
    4: from interpreter.rb:74:in `loop'
    3: from interpreter.rb:77:in `block in run'
    2: from interpreter.rb:77:in `each'
    1: from interpreter.rb:78:in `block (2 levels) in run'
interpreter.rb:157:in `execute': Unknown command '"\"rab' (RuntimeError)

remuladgryta commented 5 years ago

Since no other instruction may contain spaces, I think String#myshellsplit could be altered to read as follows:

    def myshellsplit
        return self.scan /(?:                       # Match either
                                "                   # A string literal, which begins with "
                                (?:                 # and contains

                                        \\.         # any escaped character
                                    |               # OR
                                        [^"\\]      # any character which is not " or \

                                )*                  # of which there can be any number
                                "                   # finally string literals end with a "

                            |                       # or match a non-string instruction
                                \S+                 # which are all composed of one or more
                                                    # non-whitespace characters
                          )/x
    end

I tested this with the example programs and a quine I've been working on and they did not break, but I might be missing some edge case they don't cover.
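For anyone who wants to check the pattern outside the interpreter, here is a minimal sketch. It uses a compact, comment-free form of the same regular expression and a standalone helper named tokenize, which is a hypothetical name for illustration only (the proposal itself patches String#myshellsplit). The last input is a made-up instruction sequence, not taken from any real legit program.

```ruby
# Compact form of the pattern proposed above: a double-quoted literal
# (backslash escapes allowed) or a run of non-whitespace characters.
TOKEN_PATTERN = /"(?:\\.|[^"\\])*"|\S+/

# Hypothetical standalone helper; not part of interpreter.rb.
def tokenize(message)
  message.scan(TOKEN_PATTERN)
end

# Commit messages as Git stores them, i.e. after the shell has processed -m.
messages = [
  '"\"rab\" dias oof"',                # original message
  '"\" rab\" dias oof"',               # space before rab
  '"\"rab \" dias oof"',               # space after rab
  '"Hello                  world\n"',  # run of spaces inside the literal
]

messages.each do |message|
  tokens = tokenize(message)
  # Each message should survive as a single, unmodified token.
  puts "#{tokens.length} token(s), verbatim: #{tokens == [message]}"
end

# Instructions without spaces are still separated on whitespace.
p tokenize('1 "a b" add')
```

Since every literal comes back verbatim as one token, the two failing messages from the report and a literal containing a run of spaces should all reach undump intact.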

remuladgryta commented 5 years ago

Another bug that gets fixed by this implementation is that multiple consecutive spaces in a string literal get squashed down to one space. E.g.

"Hello                  world\n"

gets squashed to

"Hello world\n"

blinry commented 5 years ago

Thanks so much! Tell me more about that quine you're working on! :O

remuladgryta commented 5 years ago

Sure! It's a pretty standard quine approach: take a string literal payload and print it once as a string literal and once as program code, making sure to add some cruft to the ends to make it all line up. Unfortunately it's not exactly readable code, but the repo is available here: https://github.com/remuladgryta/legit-quine

I might eventually get around to adding comments, a license, and the helper script I wrote to generate the payload string. If not, feel free to use it under the MIT license.

blinry commented 5 years ago

Oh wow, awesome! :O When I tried to run it, I noticed that the tags being jumped to are not defined – instead, on GitHub, these seem to be branches? Does the program work for you?

I'm gonna give a presentation about legit tomorrow (in NYC, if you're nearby, by any chance), and your contribution will be featured prominently! :D

remuladgryta commented 5 years ago

Sorry, that's my mistake, I forgot to also push the tags. It should be remedied now. If something's still amiss, here's a gist of the git commands: https://gist.github.com/remuladgryta/d8e93deedb9d02238d55f2efe87b6d1f

Running

source legit-quine.sh && ruby /path/to/legit/interpreter.rb .

should output the same contents as legit-quine.sh.