Closed PeterJCLaw closed 2 years ago
Sounds sensible. You should probably have a brief look at the first few changes to asttokens.py
in #93 as they will be relevant to this. I imagine the ultimate solution will simply be converting the variable original_tokens
in _generate_tokens
to an optional parameter of _generate_tokens
, init_tokens
, and __init__
.
I imagine the ultimate solution will simply be converting the variable
original_tokens
in_generate_tokens
to an optional parameter of_generate_tokens
,init_tokens
, and__init__
.
Yup, that's pretty much what I had in mind.
You should probably have a brief look at the first few changes to
asttokens.py
in #93 as they will be relevant to this.
Ah, yeah, that's going to overlap a bit and I suspect that allowing all the various construction cases is going to be a bit complicated too.
For my understanding: what use is the ASTTokens
class if it doesn't have an AST to work with (i.e: both tree=None
and parse=False
)? Are there consumers who use it just for the token utils? (I've found them useful, though wouldn't have thought to use this library just for them).
I realise this is perhaps a bit beyond the scope of this issue and I'm perhaps late to the conversation (and a breaking API change), but I wonder whether it would make sense to have ASTTokens
always build up the aspects (tokens/ast) that it doesn't have rather than trying to accommodate the cases of partial data?
That would allow the ASTTokens
class to be simpler -- a lot of the assert
s that PR adds would go away and consumers wouldn't need to type-guard access to ASTTokens.tree
.
If the token-only or ast-only functionality is useful on its own then perhaps extracting them to standalone functions or base classes would make sense? That could leave the current ASTTokens
class pulling all the features together for those that want it, but allow subsets to be used on their own without complicating the code.
If you'd rather stick with the current design I'm happy to work with that, I just didn't want to end up creating a lot of complications for #93.
In the meanwhile I've put up a minimal PR for just the change to allow passing in a list of tokens: https://github.com/gristlabs/asttokens/pull/95. Happy to rebase that on #93 if you'd prefer that.
For my understanding: what use is the ASTTokens class if it doesn't have an AST to work with
@dsagal I'm also curious about this.
I wonder whether it would make sense to have ASTTokens always build up the aspects (tokens/ast) that it doesn't have rather than trying to accommodate the cases of partial data?
If the token-only or ast-only functionality is useful on its own then perhaps extracting them to standalone functions or base classes would make sense? That could leave the current ASTTokens class pulling all the features together for those that want it, but allow subsets to be used on their own without complicating the code.
The AST-only functionality mostly works, but if you ask for the source text range of a node that it doesn't support, it automatically initialises tokens so that it can use them. So splitting up the class sounds like a challenge.
For my understanding: what use is the ASTTokens class if it doesn't have an AST to work with
@dsagal I'm also curious about this.
I don't have any compelling use cases for when both tree=None
and parse=False
. All practical cases I know set one or the other.
So there's no reason this test is written like this?
Well, the reason is that ASTTokens supports a mode where it only deals with tokens, so tokenization-specific unittests use that mode. But I am not aware of practical uses of ASTTokens that rely on this mode.
Thanks for this library, I've found it really useful for building source manipulation tools.
I'm currently building a flake8 plugin and it would be great if there was a way to build an
ASTTokens
instance from an existing list of (stdlib) tokens. This would complement the ability to pass in an existing AST tree and allow cases which already have both available to avoid re-parsing the source code. (Flake8 makes both an existing AST and tokens list available to plugins).If this is something you'd be open to I'd be happy to put together a PR. (The change seems simple enough, though possibly I'm missing something)