clarkmcc / cel-rust

Common Expression Language interpreter written in Rust
https://crates.io/crates/cel-interpreter
MIT License

support for escaping in Bytes and bytes macro #64

Closed alexsnaps closed 3 months ago

alexsnaps commented 3 months ago

Quick note here: I set out to implement \ x HEXDIGIT HEXDIGIT and \ [0-3] [0-7] [0-7] for escaping byte sequences in b"" BYTES_LIT, but according to the spec's lexis, every escape supported by STRING_LIT should be supported there as well:

BYTES_LIT      ::= [bB] STRING_LIT
ESCAPE         ::= \ [abfnrtv\?"'`]
                 | \ x HEXDIGIT HEXDIGIT
                 | \ u HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT
                 | \ U HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT
                 | \ [0-3] [0-7] [0-7]

So another way to go about this is to refactor parse_string into parse_bytes_literal(s: &str) -> Result<Vec<u8>, ParseError>, and have the "newer" parse_string call into it and then String::from_utf8 the parsed bytes. I think that'd be closer to the spec, though it is a bigger change. The change in this PR mostly allows us to represent arbitrary byte sequences and work with them, which wasn't possible before.
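For illustration, the refactor described above could look roughly like the sketch below. It is not the crate's actual code: the `ParseError` variants and the `read_digits` helper are hypothetical, and the input is assumed to be the literal's body with the `b` prefix and surrounding quotes already stripped. It treats `\x` and octal escapes as raw octets, which suits bytes literals; whether string literals should interpret them as code points instead is left open here.

```rust
/// Hypothetical error type for this sketch; the crate's real `ParseError` may differ.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// Malformed or unknown escape starting at this character index.
    InvalidEscape(usize),
    /// `\u` / `\U` value that is not a valid Unicode scalar value.
    InvalidUnicode(u32),
    /// The decoded bytes are not valid UTF-8 (only raised by `parse_string`).
    InvalidUtf8,
}

/// Hypothetical helper: reads exactly `n` digits in `radix` starting at `start`.
fn read_digits(chars: &[char], start: usize, n: usize, radix: u32) -> Option<u32> {
    let digits = chars.get(start..start + n)?;
    digits
        .iter()
        .try_fold(0u32, |acc, c| Some(acc * radix + c.to_digit(radix)?))
}

/// Decodes the body of a literal (prefix and quotes already removed) into bytes.
pub fn parse_bytes_literal(s: &str) -> Result<Vec<u8>, ParseError> {
    let chars: Vec<char> = s.chars().collect();
    let mut out = Vec::with_capacity(s.len());
    let mut buf = [0u8; 4];
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c != '\\' {
            // Unescaped characters contribute their UTF-8 encoding.
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
            i += 1;
            continue;
        }
        let err = ParseError::InvalidEscape(i);
        let esc = *chars.get(i + 1).ok_or(ParseError::InvalidEscape(i))?;
        match esc {
            // Single-character escapes: \ [abfnrtv\?"'`]
            'a' => { out.push(0x07); i += 2; }
            'b' => { out.push(0x08); i += 2; }
            'f' => { out.push(0x0c); i += 2; }
            'n' => { out.push(b'\n'); i += 2; }
            'r' => { out.push(b'\r'); i += 2; }
            't' => { out.push(b'\t'); i += 2; }
            'v' => { out.push(0x0b); i += 2; }
            '\\' | '?' | '"' | '\'' | '`' => { out.push(esc as u8); i += 2; }
            // \ x HEXDIGIT HEXDIGIT: one raw byte.
            'x' => {
                let v = read_digits(&chars, i + 2, 2, 16).ok_or(err)?;
                out.push(v as u8);
                i += 4;
            }
            // \ u (4 hex digits) or \ U (8 hex digits): a code point, emitted as UTF-8.
            'u' | 'U' => {
                let n = if esc == 'u' { 4 } else { 8 };
                let v = read_digits(&chars, i + 2, n, 16).ok_or(err)?;
                let ch = char::from_u32(v).ok_or(ParseError::InvalidUnicode(v))?;
                out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
                i += 2 + n;
            }
            // \ [0-3] [0-7] [0-7]: one raw byte, at most 0o377.
            '0'..='3' => {
                let v = read_digits(&chars, i + 1, 3, 8).ok_or(err)?;
                out.push(v as u8);
                i += 4;
            }
            _ => return Err(err),
        }
    }
    Ok(out)
}

/// The "newer" `parse_string`: decode to bytes first, then require valid UTF-8.
pub fn parse_string(s: &str) -> Result<String, ParseError> {
    let bytes = parse_bytes_literal(s)?;
    String::from_utf8(bytes).map_err(|_| ParseError::InvalidUtf8)
}
```

Under these assumptions, `parse_bytes_literal(r"\xff\377")` yields two `0xff` bytes, while `parse_string(r"\xff")` fails with `InvalidUtf8`, since a lone `0xff` byte is not valid UTF-8.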

clarkmcc commented 3 months ago

> So another way to go about this

I'm fine with this approach for now.