vishpat / lisp-rs

MIT License
258 stars · 25 forks

Some things that crash or give the wrong result #4

Open NicholasSterling opened 2 years ago

NicholasSterling commented 2 years ago

I love this project, and very much appreciate the effort that went into it. Kudos!

I'm not sure whether this is intended for actual use or just a simple teaching exercise, but I thought I'd mention a few things that failed (for some definition of "fail", e.g. crash or do something unexpected).

- `12` -- expects an outer set of parens
- `(4)` -- evals to `(4)`, not `4`; `(+ 3 1)` evals to `4`
- `(print "foo)` -- prints `foo)`
- `(")` -- evals to `())`

Are you open to pull requests? There are also a few things that could be simplified.

vishpat commented 2 years ago

@NicholasSterling, please create a pull request against the main branch.

NicholasSterling commented 2 years ago

Done. The PR does not attempt to fix any of the errors; it just suggests simplifications.

By the way, the lexer converts the UTF-8 String to a Vec<char> and then processes the characters one at a time, from the beginning, removing each character as it goes. Removing a character from the front of a Vec requires shifting all of the remaining elements, so processing an N-char input takes O(N^2) time.
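To make the cost concrete, here is a hedged sketch (not the actual lisp-rs code; the function names are mine) contrasting the front-removal pattern with a single linear pass:

```rust
// Illustrative only, not the lisp-rs lexer. Each remove(0) shifts every
// remaining element one slot to the left, so the loop as a whole does
// O(N^2) work on an N-char input.
fn consume_from_front(input: &str) -> usize {
    let mut chars: Vec<char> = input.chars().collect();
    let mut count = 0;
    while !chars.is_empty() {
        let _ch = chars.remove(0); // O(len) shift on every call
        count += 1;
    }
    count
}

// The same traversal via an iterator: one linear pass, nothing is shifted.
fn consume_linearly(input: &str) -> usize {
    input.chars().count()
}
```

Both return the same result; only the asymptotic cost differs.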

NicholasSterling commented 2 years ago

@vishpat I am pretty sure you can fix the N^2 lexing by getting an Iterator for the input string and calling peekable() so that you can peek at the next element. I could do this for you, but since my other PR is just sitting there and you are making further changes to the lexer, I'll wait for you to resolve all of that.

If you decide to do this yourself, I'm suggesting getting rid of the Vec in tokenize() and doing

```rust
let mut iter = input.chars().peekable();
```

Then when you need characters, you can do

`while let Some(ch) = iter.next() { ... }`, or just peek at the next char with `iter.peek()`.
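Putting the two together, a minimal sketch of the suggested approach might look like this (illustrative token names, not the actual lisp-rs `tokenize()`):

```rust
// One linear pass over the input: next() consumes a char, peek() lets us
// stop at a delimiter without consuming it. No Vec<char>, no remove(0).
#[derive(Debug, PartialEq)]
enum Token {
    LParen,
    RParen,
    Integer(i64),
    Symbol(String),
}

fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut iter = input.chars().peekable();
    while let Some(ch) = iter.next() {
        match ch {
            '(' => tokens.push(Token::LParen),
            ')' => tokens.push(Token::RParen),
            c if c.is_whitespace() => {}
            c => {
                // Accumulate a word up to (but not including) the next
                // delimiter, which stays available for the outer loop.
                let mut word = c.to_string();
                while let Some(&next) = iter.peek() {
                    if next.is_whitespace() || next == '(' || next == ')' {
                        break;
                    }
                    word.push(next);
                    iter.next();
                }
                tokens.push(match word.parse::<i64>() {
                    Ok(n) => Token::Integer(n),
                    Err(_) => Token::Symbol(word),
                });
            }
        }
    }
    tokens
}
```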

If you want me to do it, just give me the all-clear when the lexer dust has settled.

jcubic commented 2 years ago

`(4)` should throw an error: it applies a non-function.
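A sketch of the evaluation rule being described (illustrative types, not lisp-rs's actual evaluator): a non-empty list is a call, so its head must be a function, and `(4)` should fail rather than evaluate to `(4)`.

```rust
// Minimal value type for the sketch. Builtin("+") is a stand-in for a
// real function object; only addition is implemented here.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Builtin(&'static str),
    List(Vec<Value>),
}

fn eval_call(items: &[Value]) -> Result<Value, String> {
    match items.split_first() {
        // () is just the empty list.
        None => Ok(Value::List(vec![])),
        // The head is a function: apply it to the arguments.
        Some((Value::Builtin("+"), args)) => {
            let mut sum = 0;
            for a in args {
                match a {
                    Value::Int(n) => sum += n,
                    other => return Err(format!("not a number: {:?}", other)),
                }
            }
            Ok(Value::Int(sum))
        }
        // Anything else at the head, e.g. (4), is an error.
        Some((head, _)) => Err(format!("not a function: {:?}", head)),
    }
}
```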

vishpat commented 2 years ago

@NicholasSterling consider the lexer dust settled. Please open a PR.

NicholasSterling commented 2 years ago

@vishpat OK, thanks -- will do so by the end of the weekend.

NicholasSterling commented 2 years ago

OK, that was too optimistic. I'll give an update in a few days.

NicholasSterling commented 2 years ago

I'm overdue for an update. I decided to do something more sophisticated, with these goals:

  1. Fix the N^2 problem.
  2. Only allocate memory when necessary (e.g. not for keywords or numbers).
  3. Make line and column number info available, for diagnostics.

I looked on crates.io for something that would help, and Tokenator looked promising, but it seemed a bit clumsy to use, so I decided to do it differently. I have been working on a new crate that helps one write tokenizers. It appears to be working, but I need to do more testing.
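To illustrate what the three goals could look like together, here is a hedged sketch (this is NOT the token-iter API, and all names are mine): one linear pass (goal 1), symbols borrowing `&str` slices of the input rather than allocating (goal 2), and every token carrying its line and column (goal 3).

```rust
// Position of a token, 1-indexed, for diagnostics.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Pos {
    line: u32,
    col: u32,
}

// Sym borrows a slice of the input, so symbol text is never copied.
#[derive(Debug, PartialEq)]
enum Tok<'a> {
    LParen,
    RParen,
    Int(i64),
    Sym(&'a str),
}

fn tokenize_with_pos(input: &str) -> Vec<(Pos, Tok<'_>)> {
    let mut out = Vec::new();
    let mut iter = input.char_indices().peekable();
    let (mut line, mut col) = (1u32, 1u32);
    while let Some((i, ch)) = iter.next() {
        let pos = Pos { line, col };
        match ch {
            '\n' => {
                line += 1;
                col = 1;
                continue; // skip the col += 1 below
            }
            '(' => out.push((pos, Tok::LParen)),
            ')' => out.push((pos, Tok::RParen)),
            c if c.is_whitespace() => {}
            _ => {
                // Extend the word to the next delimiter; slicing the input
                // by byte range avoids allocating for the token text.
                let mut end = i + ch.len_utf8();
                while let Some(&(j, next)) = iter.peek() {
                    if next.is_whitespace() || next == '(' || next == ')' {
                        break;
                    }
                    end = j + next.len_utf8();
                    iter.next();
                    col += 1;
                }
                let word = &input[i..end];
                match word.parse::<i64>() {
                    Ok(n) => out.push((pos, Tok::Int(n))),
                    Err(_) => out.push((pos, Tok::Sym(word))),
                }
            }
        }
        col += 1;
    }
    out
}
```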

NicholasSterling commented 2 years ago

@vishpat After much thrashing over the API, the tokenizer crate I have been working on (token-iter) is ready. If you are OK with it, I would like to make lisp-rs use it, let you check it out, and then publish the crate. If you would rather not use a separate crate for tokenization, I understand, but please let me know either way.

https://github.com/NicholasSterling/token-iter/blob/main/src/lib.rs

NicholasSterling commented 2 years ago

@vishpat Have you had a chance to look at the token-iter crate? Are you OK with depending on an external crate for lexical analysis? https://docs.rs/token-iter/latest/token_iter/