Implement string parser

inikolaev commented 1 year ago

I've been looking into implementing a proper string parser that supports interpretation of escape sequences as mentioned in the CEL specification.

I based my implementation on snailquote, but had to change it so that it can support things from CEL specification.

So far it supports all escape sequences and raw strings. I plan to look into supporting triple quotes as well, but in a separate PR.

Please let me know what you think about this approach - maybe you had some different thoughts on how to do this differently, but generally this approach seem to work and could be scaled to other tokens such as byte strings and even numbers.

I'm very new to Rust and would appreciate your feedback. I will look into adding some tests, but need to learn how to do that.

clarkmcc commented 1 year ago

Very nice! I'll dive into this and get back to you. Tests would be great, I've been adding them to the bottom of the files of the code they're testing. Checkout the functions.rs file for an example of how to annotate a test function.

clarkmcc commented 1 year ago

I haven't looked at the PR itself yet, but a couple things right off the bat to consider:

It looks like there are some regressions on existing unit tests, so we'd want to make sure your PR addresses each of those first.
At some point in the near future when I get the time for the next major CEL project, I'd like to get #18 merged. That doesn't mean that we can't merge this PR first, it just means that if we do, we'll need to implement this twice (if it's not already supported in #18 which something in the back of my brain is tickling me).

inikolaev commented 1 year ago

I haven't looked at the PR itself yet, but a couple things right off the bat to consider:

It looks like there are some regressions on existing unit tests, so we'd want to make sure your PR addresses each of those first.

At some point in the near future when I get the time for the next major CEL project, I'd like to get Add chumsky parser #18 merged. That doesn't mean that we can't merge this PR first, it just means that if we do, we'll need to implement this twice (if it's not already supported in Add chumsky parser #18 which something in the back of my brain is tickling me).

Yes, I have noticed some test failing, but haven't looked into them until now. The regular expression was too greedy, so I asked ChartGPT for help and now tests are passing :)

I'm currently adding some tests for the string parser.

No worries if we have to replace the implementation if new parser comes into the picture. I use this as an opportunity to familiarise myself with Rust, so happy to help.

inikolaev commented 10 months ago

@clarkmcc I've got some time to work on this now, would you mind approving the workflow to see how it looks now?

clarkmcc commented 10 months ago

Workflow approved! It's been a while since I looked at this PR but if I remember right there were a lot of allocations happening (String::from). It would be great if that could be optimized and I'm even wondering if using nom is the most efficient way to do this.

In any case, move in whatever direction you'd like to take this and we can look at performance after we've got something working.

inikolaev commented 10 months ago

@clarkmcc I have the removed the majority of String::from usages - only cases that handle unicode characters still use it, so I thing this should be good for now.

nom does look interesting indeed!

inikolaev commented 10 months ago

Actually checked the code once again and noticed that I was creating a string from a character when handling unicode - changed it to returned char instead.

clarkmcc commented 10 months ago

Thanks! I'm out on vacation now but will review once I return next week.

clarkmcc commented 9 months ago

Thanks for getting the tests working! Are you interested in adding parser benchmarks? I've been meaning to add them for a while and now seems like a good change. If not, no stress and I can get those added in myself.

inikolaev commented 9 months ago

Thanks for getting the tests working! Are you interested in adding parser benchmarks? I've been meaning to add them for a while and now seems like a good change. If not, no stress and I can get those added in myself.

I've added some simple benchmarks - do you think these are good enough for now or do you have anything else in mind?

clarkmcc commented 9 months ago

This is great for now. I'm part way through a review with a few feedback items. I started this weekend but have been sidetracked this week. I hope to have a review finished this weekend.

clarkmcc / cel-rust

Implement string parser #24