Improve multiword lexing

probablykasper commented 3 years ago

Alternative to https://github.com/probablykasper/cpc/pull/16

@djmattyg007

probablykasper commented 3 years ago

Re-added pound-force and newton-meter parsing. Went for the approach that involves adding the Minus token from within the word parser.

djmattyg007 commented 3 years ago

You should port the test suite I wrote over to your branch to see how it performs.

djmattyg007 commented 3 years ago

I applied the following patch to your branch:

diff --git a/src/lexer.rs b/src/lexer.rs
index cad1cd9..c4370a9 100644
--- a/src/lexer.rs
+++ b/src/lexer.rs
@@ -178,7 +178,9 @@ pub fn parse_word(word: &str, lexer: &mut Lexer) -> Result<(), String> {

     "pi" => Token::Constant(Pi),
     "e" => Token::Constant(E),
-  
+
+    "plus" => Token::Operator(Plus),
+
     "mod" => Token::Operator(Modulo),

     "sqrt" => Token::FunctionIdentifier(Sqrt),

And then tried out a similar example to what I tried yesterday:

❯ cargo build --release --locked                                                                                        
   Compiling cpc v1.6.0 (/home/djmattyg007/repos/rust/cpc)                                                             
    Finished release [optimized] target(s) in 2.56s

❯ target/release/cpc '12 watts plus 23 watt'                                                                            
35 Watt

❯ target/release/cpc '12 watt plus 23 watt'                                                                             
Parsing error: Expected end of input, found Number(23) at 2

There are also still some issues with handling of pounds in your implementation:

❯ target/release/cpc '12 pound+ 23 pound'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pound+ 23 pounds'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pound + 23 pounds'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pound + 23 pound'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pound + 23 pound '
35 Pound

❯ target/release/cpc '12 pounds + 23 pound'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pounds + 23 pounds'
Eval error: Cannot add Pound and NoUnit

❯ target/release/cpc '12 pounds + 23 pounds '
35 Pound

probablykasper commented 3 years ago

Fixed it, issue was that when checking pound it didn't add Pound if there's no character after it.

Turns out the performance regression earlier was actually because of a println!() statement.

You should port the test suite I wrote over to your branch to see how it performs.

Yeah, definitely, will be adding those

probablykasper commented 3 years ago

Copied in your tests, and added plus/minus/times as well. Let me know if you have any comments, then if everything's good I'll go ahead and merge.

What's missing from your PR is multiplied by, divided by and your rev/min fix. Do you want to open a new PR for those?

Really appreciate all your help with this!

djmattyg007 commented 3 years ago

Turns out the performance regression earlier was actually because of a println!() statement.

It's good to hear that the issue was just unnecessary I/O.

Copied in your tests, and added plus/minus/times as well.

I recommend adding the additional test cases that make small modifications to the input. They help to prove the resiliency of the lexer.

Let me know if you have any comments, then if everything's good I'll go ahead and merge.

I've give it a full review soon.

What's missing from your PR is multiplied by, divided by and your rev/min fix. Do you want to open a new PR for those?

I'm happy to take a look, yeah.

Really appreciate all your help with this!

No problem. I've learned a ton by doing all of this :)

djmattyg007 commented 3 years ago

Really appreciate all your help with this!

I must admit it's an odd feeling having to let go of my own implementation that I spent so long on. I have to remind myself that it's not about me, and that ultimately my work did still lead to a better overall outcome for everyone.

probablykasper commented 3 years ago

I recommend adding the additional test cases that make small modifications to the input. They help to prove the resiliency of the lexer.

I think I added all of them?

I must admit it's an odd feeling having to let go of my own implementation that I spent so long on. I have to remind myself that it's not about me, and that ultimately my work did still lead to a better overall outcome for everyone.

Yeah, I was contemplating whether or not it would be right to end up basically saying no thanks and then going for my own implementation :/ But you're right, your PR helped a ton even if it's not the one being merged.

djmattyg007 commented 3 years ago

I was referring to this bit:

https://github.com/djmattyg007/cpc/blob/rewrite_lexer/src/lexer.rs#L1013

They help prove that most whitespace isn't load-bearing.

probablykasper commented 3 years ago

I see, thanks for mentioning!

btw, do you know if there's any reason to not just use assert_eq!(tokens, expected_tokens)? Not sure why I decided to do it the way it is now

probablykasper commented 3 years ago

@djmattyg007 Think it's ready to be merged?

djmattyg007 commented 3 years ago

btw, do you know if there's any reason to not just use assert_eq!(tokens, expected_tokens)? Not sure why I decided to do it the way it is now

I think I just blindly assumed that that wouldn't work, because I'm not sure of exactly how that comparison would happen. You probably know more than I do, so if it works it's probably fine :)

Think it's ready to be merged?

The only thing I can think of is that the title of the PR needs to be updated, because the lexer is once again not recursive :stuck_out_tongue:

Otherwise, yeah, it looks good :+1: :tada:

probablykasper commented 3 years ago

Awesome, thanks! Thanks for working together with me on this ❤️

djmattyg007 commented 3 years ago

Thank you for building such a great piece of software for everyone to use :heart:

probablykasper / cpc

Improve multiword lexing #18