tagua-vm / parser

Safe, fast and memory efficient PHP parser (lexical and syntactic analysers, and the Abstract Syntax Tree)
http://tagua.io/

upgrade to nom 2.0 #83

Closed Geal closed 7 years ago

Geal commented 7 years ago

Hi!

As part of the release process for nom 2.0, I selected a few crates to test it with, to make sure everything works correctly.

I changed a few things, and this project should work fine with nom 2.0 now!

If you want more information on this release, see the blogpost: https://unhandledexpression.com/2016/11/25/this-year-in-nom-2-0-is-here/ (reddit discussion: https://www.reddit.com/r/rust/comments/5espfm/this_year_in_nom_20_is_here/ )

There are new features that could be of interest for this project :)

Hywan commented 7 years ago

Thank you very much!

I started a similar PR last night, haha 😉, but I am glad to accept yours!

Geal commented 7 years ago

This is an amazing project, I really wanted it to work on 2.0 :)

Hywan commented 7 years ago

Same here 😉. What is the concrete difference between chain! and do_parse!? I use them extensively.

Hywan commented 7 years ago

@Geal The new tag_no_case! is equivalent to itag! (defined in this project). Is the former faster? If you are planning to add UTF-8 support, will it be with the same macro? I really don't need UTF-8 support, and I am afraid it could slow things down.

Hywan commented 7 years ago

Hurray for named_attr!, by the way. I tried to implement this one, but it was really hard. Thanks!

Geal commented 7 years ago

tag_no_case! works on ASCII strings by ignoring the bit that indicates case, while itag! relies on eq_ignore_ascii_case, which uses a lookup table. I have not benchmarked the two; I'd be interested in the results.

If you pass a &str to tag_no_case!, it will lowercase both strings and compare, which is a really bad way to do it (people really shouldn't do case insensitive comparison on strings containing unicode chars).
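A minimal sketch of the comparison strategies described above, in plain Rust with no nom dependency. The function names are hypothetical and this is not nom's or itag!'s actual code; it only illustrates the idea of masking the ASCII case bit (0x20), a stricter letters-only variant, and the lowercase-both-strings approach for &str.

```rust
/// Compare byte slices while ignoring bit 0x20, the ASCII "case bit".
/// Caveat: masking alone also equates non-letters that differ only by
/// that bit, such as b'@' (0x40) and b'`' (0x60).
fn eq_ignore_case_bit(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| (x | 0x20) == (y | 0x20))
}

/// Stricter variant: two bytes match if they are identical, or if they
/// differ only in the case bit and are ASCII letters.
fn eq_ignore_ascii_letters(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(&x, &y)| {
            x == y || ((x ^ y) == 0x20 && (x | 0x20).is_ascii_lowercase())
        })
}

/// The &str approach: lowercase both sides, then compare.
/// This allocates two new Strings on every call, and lowercasing is not
/// full Unicode case folding, so some "equal" words do not match.
fn eq_lowercased(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}
```

For comparison, the standard library already offers `<[u8]>::eq_ignore_ascii_case`, so `b"ECHO".eq_ignore_ascii_case(b"echo")` is true without any helper. With the `&str` version, `eq_lowercased("Straße", "STRASSE")` is false, which is one illustration of why lowercasing is a poor fit for text containing non-ASCII characters.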

Both implementations are separated in trait implementations: https://github.com/Geal/nom/blob/master/src/traits.rs#L288-L387

If you're worried about performance, benchmark tag_no_case! and itag!, keep the faster one, and keep casting to a &[u8] before comparing; that should work.
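A rough way to get a first impression before setting up a proper benchmark harness is to time each candidate in a loop. The sketch below uses only the standard library; `time_it` and `mask_eq` are hypothetical names, and `mask_eq` is a stand-in for whichever comparison is being measured, not the real tag_no_case! or itag! code. Numbers from a loop like this are noisy and should be treated as a hint only.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Call `f` `iterations` times; return the elapsed wall-clock time and
/// the number of calls that returned true (so results can be sanity-checked).
fn time_it<F: Fn() -> bool>(iterations: u32, f: F) -> (Duration, u32) {
    let start = Instant::now();
    let mut hits = 0;
    for _ in 0..iterations {
        // black_box discourages the optimizer from removing the call.
        if black_box(f()) {
            hits += 1;
        }
    }
    (start.elapsed(), hits)
}

/// Stand-in candidate: compare bytes while ignoring the ASCII case bit.
fn mask_eq(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| (x | 0x20) == (y | 0x20))
}
```

A call such as `time_it(1_000_000, || mask_eq(b"FUNCTION", b"function"))` can then be compared against the same loop around `b"FUNCTION".eq_ignore_ascii_case(b"function")`, built in release mode.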

For named_attr!, I'm not the one who wrote it, but I'm really happy to see it too :)

Geal commented 7 years ago

For the difference between chain! and do_parse!: chain! is hard to maintain because of its weird syntax and unnecessary features. do_parse! is a simpler rewrite of this combinator. There's no more "?" (replaced with opt!) nor "mut" (people didn't really use it, apparently).

In this release, I also wanted people to break free from their chains (lol), because they use it everywhere, while there are a lot of nice helpers like pair! or terminated!
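To make the helpers mentioned above more concrete, here is a dependency-free sketch of what pair, terminated and opt do conceptually. These are hand-written functions, not nom's macros: the names `PResult`, `tag`, `take_while1` and `is_alpha` are assumptions made for the illustration, and the sketch uses `Option` for brevity where nom has its own result type.

```rust
/// Remaining input plus the parsed value, or None on failure.
type PResult<'a, T> = Option<(&'a [u8], T)>;

/// Match a literal prefix.
fn tag<'a>(expected: &[u8], input: &'a [u8]) -> PResult<'a, &'a [u8]> {
    if input.starts_with(expected) {
        let (matched, rest) = input.split_at(expected.len());
        Some((rest, matched))
    } else {
        None
    }
}

/// Match one or more bytes satisfying `pred`.
fn take_while1<'a>(pred: fn(u8) -> bool, input: &'a [u8]) -> PResult<'a, &'a [u8]> {
    let end = input.iter().position(|&b| !pred(b)).unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    let (matched, rest) = input.split_at(end);
    Some((rest, matched))
}

fn is_alpha(b: u8) -> bool {
    b.is_ascii_alphabetic()
}

/// pair: run two parsers in sequence and keep both results.
fn pair<'a, A, B>(
    first: impl Fn(&'a [u8]) -> PResult<'a, A>,
    second: impl Fn(&'a [u8]) -> PResult<'a, B>,
    input: &'a [u8],
) -> PResult<'a, (A, B)> {
    let (rest, a) = first(input)?;
    let (rest, b) = second(rest)?;
    Some((rest, (a, b)))
}

/// terminated: run two parsers in sequence, keep the first result only.
fn terminated<'a, A, B>(
    first: impl Fn(&'a [u8]) -> PResult<'a, A>,
    second: impl Fn(&'a [u8]) -> PResult<'a, B>,
    input: &'a [u8],
) -> PResult<'a, A> {
    let (rest, a) = first(input)?;
    let (rest, _) = second(rest)?;
    Some((rest, a))
}

/// opt: never fails; yields None and consumes nothing if `parser` fails.
fn opt<'a, A>(
    parser: impl Fn(&'a [u8]) -> PResult<'a, A>,
    input: &'a [u8],
) -> PResult<'a, Option<A>> {
    match parser(input) {
        Some((rest, a)) => Some((rest, Some(a))),
        None => Some((input, None)),
    }
}
```

For example, `terminated(|i| take_while1(is_alpha, i), |i| tag(b";", i), b"foo;bar")` yields the name `foo` with `bar` left as remaining input, which is the kind of two-step sequence that does not need a full chain.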

Hywan commented 7 years ago

"people to break free from their chains"

huhu 😉.

OK, thanks! I would be glad to compare the two.

Hywan commented 7 years ago

@Geal Ahhh, named_attr! ❤️, https://github.com/tagua-vm/parser/pull/87.

Geal commented 7 years ago

wow you're fast 😮

Hywan commented 7 years ago

Awesomeness does not wait. Tagua VM is a high-quality project; I don't want to postpone things like that.