Better errors for literal parsing

ISibboI commented 1 year ago

Right now, when parsing a literal fails, evalexpr simply assumes it is supposed to be an identifier. We should introduce the basic rule that identifiers need to start with a letter, and numeric literals with a digit (as in many major programming languages).

hexofyore commented 1 year ago

@ISibboI Is it okay if I do this? I am planning to change the part of the code that converts partial tokens to tokens. If a literal starts with a letter or an underscore, it will be parsed as an identifier. Otherwise, it will try parsing it as a float and then as an integer?

ISibboI commented 1 year ago

Sure, that sounds good!

ISibboI commented 1 year ago

Actually, the rule should be more like this: when it starts with a digit, go for the number variants, and otherwise go for an identifier. That way, identifiers can be arbitrary Unicode, as long as they do not start with an (Arabic) digit.
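
A minimal sketch of that rule, for illustration only. The `Literal` and `LiteralError` types and the `classify_literal` function are hypothetical names, not evalexpr's actual API. The sketch also assumes that integers are tried before floats, since a string like `42` would otherwise parse successfully as a float.

```rust
/// Hypothetical result type for illustration; not evalexpr's real `Token`.
#[derive(Debug, PartialEq)]
enum Literal {
    Int(i64),
    Float(f64),
    Identifier(String),
}

/// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
enum LiteralError {
    Empty,
    InvalidNumber(String),
}

/// Decide by the first character: an ASCII digit means "must be a number",
/// anything else means "identifier".
fn classify_literal(text: &str) -> Result<Literal, LiteralError> {
    let first = text.chars().next().ok_or(LiteralError::Empty)?;
    if first.is_ascii_digit() {
        // Assumption: try integer first so that "42" stays an integer.
        if let Ok(int) = text.parse::<i64>() {
            return Ok(Literal::Int(int));
        }
        if let Ok(float) = text.parse::<f64>() {
            return Ok(Literal::Float(float));
        }
        // Starts with a digit but is not a number: report an error
        // instead of silently falling back to an identifier.
        Err(LiteralError::InvalidNumber(text.to_string()))
    } else {
        Ok(Literal::Identifier(text.to_string()))
    }
}
```

With this sketch, an input such as `1abc` produces an error rather than being treated as an identifier, which is the better error reporting this issue asks for.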

hexofyore commented 1 year ago

Don't Rust and other programming languages only support English letters and the underscore as the starting character? Do they support other Unicode characters?

ISibboI commented 1 year ago

Ideally, we would mimic Rust identifiers: https://doc.rust-lang.org/reference/identifiers.html

However, I am not sure if the standard library allows e.g. checking for a character being XID_Start, and I would not want to add any dependency for that. But if the standard library has a way to check if a string is a Rust identifier, then that would of course be great.

If not, or if that is too much effort, then the following would be great: any sequence of characters with the Alphabetic property (see https://doc.rust-lang.org/std/primitive.char.html#method.is_alphabetic), as well as the underscore character, is an identifier, except that a single _ is not an identifier.
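
Read literally, this fallback rule could be sketched as follows, using only the standard library. The name `is_identifier` is hypothetical, and the sketch assumes the rule applies to every character, so digits are rejected even after the first position.

```rust
/// Hypothetical helper: true if `text` consists only of alphabetic
/// characters (per `char::is_alphabetic`) and underscores, and is not
/// the single-character string "_".
fn is_identifier(text: &str) -> bool {
    !text.is_empty()
        && text != "_"
        && text.chars().all(|c| c.is_alphabetic() || c == '_')
}
```

Because `char::is_alphabetic` follows the Unicode Alphabetic property, non-English names such as `größe` or `京` are accepted, while a lone `_` is not.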

hexofyore commented 1 year ago

@ISibboI I am not so sure about this. I made some small changes. Please check and see what's missing. I will open a PR for it.

ISibboI / evalexpr

Better errors for literal parsing #134