mystor / rust-cpp

Embed C++ directly inside your rust code!
Apache License 2.0
798 stars 44 forks source link

Port to syn 0.15 #42

Closed ogoffart closed 5 years ago

ogoffart commented 6 years ago

Porting to newer syn is required in order to support new rust connstruct.

Fixes issue https://github.com/mystor/rust-cpp/issues/41

syn and proc_macro2 do not allow to get the line number or the verbatim captrued text: https://github.com/alexcrichton/proc-macro2/issues/110 This therefore has to redo the lexing manually to extract the relevant pieces for which we need these information. Only the more comples parsing is still done by syn.

This is a major change, so i took the opportunity to also run rustfmt

ogoffart commented 6 years ago

The CI is seems to be failling with nightly. But I cannot reproduce the failure locally when i install nightly. It looks like for some reason the hash is not matching between cpp_build and cpp_macro for one function. I have no clue why that can be.

ogoffart commented 6 years ago

@mystor any comments?

mystor commented 6 years ago

Hey, sorry I haven't had much time to look at this patch yet, especially with how large it is. Ill try to look at it this weekend but that's probably the earliest I can get to it.

mystor commented 6 years ago

I should also mention that syn 0.15 should be releasing soon (https://github.com/dtolnay/syn/issues/476), and it might be worthwhile to directly port to it. The entire parser engine has been replaced.

ogoffart commented 6 years ago

This means that cpp_common::parsing will need to be rewritten (again! :-))

However, I guess most of the rest of the code should not be affected, in particular the manual lexing will have to stay since proc_macro2 still does not allow to get the line information or offset.

I'll have a look at syn 0.15

ogoffart commented 6 years ago

I made the port to syn 0.15: https://github.com/ogoffart/rust-cpp/pull/1 (It is based on this work)

ogoffart commented 6 years ago

About the extraction of stringify! from cpp_macro, i implemented both alternative using explicit de-structuring (what is currently in this pull request) or use syn::visit to find the macro: https://github.com/ogoffart/rust-cpp/commit/0a75608e3e71c16bb7149de813454901fc12b778 I'm not sure what is the best. Using visit is less code and is cleaner. Probably resist better to changes in the cpp! and cpp_class! implementation, or in the syn AST. However it is probably less efficient, as it needs to re-visit the whole tree. And maybe it'd be a problem if one can have other macro (cpp_class!(#[foo(stringify!(bar)) struct F as "F") is not currently allowed, but maybe in the future?)

Overall, i think it is pretty important to upgrade syn as currently, using any of the new language constructs in a crate that wants to use cpp_build is impossible. That means already things like impl Traitor dyn Trait, use of the recently unreserved keywords, and probably more. The situation is quite bad, as we only get a warning from cpp_build that there was a parse error, without telling you where or what, and the rustc compilation works only that the cpp! function are not found. or just link error if it was already compiled before.

For this reason, I think it is quite important to release a new version fast with an updated syn.

Regarding the parsing done in cpp_build, I think we should not wait for proc_macro2 to have what we need (verbatim source code, and line/column info), as this can take a long time. Forking is not a good idea as we would have the same problem again. The custom parsing code done in cpp_build is quite simple and limited to lexing. Originally, I thought to simply find occurence of the regexpt \bcpp\s*! within the source code, and then finding matching parentheses to get the end. But that last part is impossible without proper lexing, as it can be quite complicated (example: cpp!( ... { /* } */ " \" ) /*"; })). So I imported the lexing code from proc_macro2. Change in the lexing rules might change un future rust versions, but are unlikely to change as much as the full grammar.

mystor commented 5 years ago

I'm fine with using the visitor, and it does seem quite a bit cleaner.

I think my worry with the need for manual tokenization is largely around the large complexity of having a complete copy of the lexing code within the crate. Perhaps we could make a separate crate which just handles string lexing and is used by rust-cpp, like the cpp_synmap crate used to be?

ogoffart commented 5 years ago

@mystor What is the advantage to put it in a separated crate compared to a separated module? Puting it in a different create means uploading another crate on crates.io, and also that means it needs somehow a public API and make it a full lexer. Because right now, this is not even a full lexer. I do not do keywords, or numbers or lifetime or symbols, these are just ignored.

What exactly do you want in this crate? Just put strnom.rs in its own crate? Add skip_literal and find_delimited?

mystor commented 5 years ago

I think I'm okay with this grossness for now, but I'll think on how to clean it up more. Merge away and I can publish new versions

ogoffart commented 5 years ago

Thank you.

mystor commented 5 years ago

Published as cpp 0.5 https://crates.io/crates/cpp. Thanks @ogoffart :-)