kevinushey / sourcetools

Tools for reading, tokenizing, and parsing R code.
MIT License

Plans to update sourcetools for new 4.1.0 syntax? #22

Closed: MilesMcBain closed this issue 3 years ago

MilesMcBain commented 3 years ago

Hi @kevinushey

I am interested in any plans to change the way {sourcetools} parses tokens for the new |> operator.

Right now we have:

sourcetools::tokenize_string("1 |> I()")
  value row column       type
1     1   1      1     number
2         1      2 whitespace
3     |   1      3   operator
4     >   1      4   operator
5         1      5 whitespace
6     I   1      6     symbol
7     (   1      7    bracket
8     )   1      8    bracket

That is actually not too bad, since it is fairly easy to merge the two operator tokens into one:

sourcetools::tokenize_string("1 |> I()") |> polyfill_base_pipe()
  value row column       type
1     1   1      1     number
2         1      2 whitespace
3    |>   1      3   operator
5         1      5 whitespace
6     I   1      6     symbol
7     (   1      7    bracket
8     )   1      8    bracket

polyfill_base_pipe() is a function I have in {breakerofchains}, a reverse dependency of {sourcetools} that is not on CRAN at the moment.
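For readers who want to see what such a transformation could look like, here is a minimal sketch. It is not the code from {breakerofchains}: the name merge_base_pipe_tokens() is hypothetical, and it assumes the token table has the value, row, column, and type columns shown in the output above, with value and type stored as character. The example builds that table by hand so it runs without {sourcetools} installed.

```r
# Hypothetical helper, NOT the implementation from {breakerofchains}.
# Merges each adjacent `|` `>` operator pair into a single `|>` token.
merge_base_pipe_tokens <- function(tokens) {
  n <- nrow(tokens)
  if (n < 2L) return(tokens)

  i <- seq_len(n - 1L)  # candidate positions of `|`
  j <- i + 1L           # the token that follows each candidate

  # A pair qualifies when `|` is immediately followed by `>`:
  # same row, and the `>` sits exactly one column to the right.
  is_pipe_start <-
    tokens$type[i] == "operator" & tokens$value[i] == "|" &
    tokens$type[j] == "operator" & tokens$value[j] == ">" &
    tokens$row[i] == tokens$row[j] &
    tokens$column[j] == tokens$column[i] + 1L

  starts <- which(is_pipe_start)
  if (length(starts) == 0L) return(tokens)

  # Rewrite the `|` token as `|>` and drop the now redundant `>` token.
  tokens$value[starts] <- "|>"
  tokens[-(starts + 1L), , drop = FALSE]
}

# Token table copied by hand from the tokenizer output shown above.
tokens <- data.frame(
  value  = c("1", " ", "|", ">", " ", "I", "(", ")"),
  row    = rep(1L, 8),
  column = 1:8,
  type   = c("number", "whitespace", "operator", "operator",
             "whitespace", "symbol", "bracket", "bracket"),
  stringsAsFactors = FALSE
)

merged <- merge_base_pipe_tokens(tokens)
print(merged)
```

Because the `>` row is dropped rather than renumbered, the surviving rows keep their original row names, which is why the transformed output skips from 3 to 5.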

If you are planning to update the tokenizer for this syntax, is there something I could follow to track its progress?

Thanks, Miles

kevinushey commented 3 years ago

Thanks for the nudge! I'll try to implement this soon.

kevinushey commented 3 years ago

Looks like I did implement this a while back:

https://github.com/kevinushey/sourcetools/commit/8827fa5dff46865bfc9603ef3f656c2f71e0fa95

So, with the development version of sourcetools, I see:

> sourcetools::tokenize_string("1 |> I()")
  value row column       type
1     1   1      1     number
2         1      2 whitespace
3    |>   1      3   operator
4         1      5 whitespace
5     I   1      6     symbol
6     (   1      7    bracket
7     )   1      8    bracket

Can you confirm?