Closed (geographika closed 1 year ago)
Hello!
Yes, Token.value is defined as a string, for performance reasons. It's defined here: https://github.com/lark-parser/lark_cython/blob/master/lark_cython/lark_cython.pyx#L20
But nothing's stopping you from converting them to lark.Tokens, like so:
return lark.Token.new_borrow_pos(t.type, float(t.value), t)
Thanks @erezsh, and thanks for this project!
This approach works fine for all the transformer functions that convert to any type other than str.
Does this approach, however, negate any performance boost from using lark_cython? The full test suite went from 1min20 to 1min50 (a very rough benchmark).
I'll continue to play around and look at failing tests.
This shouldn't have any direct effect on lark-cython's performance. But it's possible that you are creating a lot of Token instances, and that's taking a lot of time. (I do remember mappyfiles having a lot of ints in them)
Token values in lark_cython are typed as str.
In my Transformer I'm changing the token values to their correct types, for example int:
This throws the following error in lark_cython:
I return the full token in the transformer, rather than simply an int value, as later logic (for error handling etc.) takes advantage of the token properties:
Do token values have to be strings in lark_cython? If so, is there any approach which would allow token values to be converted to support int, float etc.?