nvim-treesitter / nvim-treesitter-textobjects

Apache License 2.0
2.2k stars 199 forks source link

refactor: replace all `Range` with `TSNode` as much as possible #674

Open ofseed opened 2 months ago

ofseed commented 2 months ago

The Range type should only be used internally. To leverage more upstream API, we should convert it into a TSNode as much as possible.

ofseed commented 2 months ago

@clason we could continue our discussion in https://github.com/nvim-treesitter/nvim-treesitter-textobjects/pull/672#issuecomment-2300006562 here.

ofseed commented 2 months ago

Some textobjects need the make-range directive to be defined, and then we read them in metadata, so it's impossible to convert these Range into TSNode. Luckily there is already https://github.com/nvim-treesitter/nvim-treesitter-textobjects/pull/612 which attempts to remove usages of this directive. @ribru17 Is my understanding of this PR correct?

ribru17 commented 2 months ago

Pretty much, except I think there was a discussion to keep the predicate definition (although now unneeded) for backwards compatibility with user queries

ribru17 commented 2 months ago

Also I haven't done a thorough review, but is this a fully safe change? Sometimes metadata.range is needed over node:range() because the latter doesn't account for range changes from something like #offset!

ofseed commented 2 months ago

but is this a fully safe change?

This is what I am concerned about most too.

I think we can handle the problem with #offset!. #offset! tries to change the range of a TSNode, while #make-range! tries to create a new range, which has no corresponding TSNode. We can store TSMetadata like store TSNode.

clason commented 2 months ago

I think we always have to check the metadata; the question is just when (do we precompute that or not); that is the same as for any other query.

The difference here is that the original code uses the "metadata" ranges as primary objects, which could come from nodes or from #make-range!. So if the latter remains supported, we can't switch to a node-centric model. And if we drop support for it, the predicate is useless, and keeping it defined (but not working) would be a worse user experience than removing it (and telling users that relied on it for non-textobjects usage to just copy the old code in their config or plugin).

So the signs are starting to point to a full breaking rewrite on main and freezing master.

ofseed commented 2 months ago

The difference here is that the original code uses the "metadata" ranges as primary objects,

Honestly, it is a rather complicated situation. It uses ranges from TSNode primarily, and stores the range from metadata. the range in metadata is a Range4, so only when functions need a Range4(and the range in metadata exists), the range in metadata will be returned, otherwise it will always use the range from TSNode. This part of the code should be a patch added later.

So If #make-range! could be fully removed, #offset! would not be a big problem. Because there are only 15 usages of #offset! and all of them are used to exclude or include some chars, like parenthesis or spaces. This means we still could use TSNode internally instead of Range because these chars should not have an impact on sorting or comparing them.

So the signs are starting to point to a full breaking rewrite on main and freezing master.

Indeed. At least https://github.com/nvim-treesitter/nvim-treesitter-textobjects/pull/612 must be merged before we move on.

ofseed commented 2 months ago

Brief summary, three sources of range:

The first two sources could be handled if we use TSNode, but the last one does not.

clason commented 2 months ago

Because there are only 15 usages of #offset! and all of them are used to exclude or include some chars, like parenthesis or spaces.

In our queries. Experience has taught me the hard way that user configs (and third-party plugins) can be full of all kinds of stuff...

ofseed commented 2 months ago

Since the mainbranch has already broken user configurations, it wouldn't be a particularly big issue if it continues to break user queries, lol.

However, one point I want to discuss is whether, if #make-range! is removed, we can still transition from Range to TSNode while preserving all existing features. The current discussion leads me to believe that this is possible.

clason commented 2 months ago

Since the main branch has already broken user configurations, it wouldn't be a particularly big issue if it continues to break user queries, lol.

Well, that's not the right way of looking at this. Right now, main is a new plugin that users can but need not use. The breakage will happen when we freeze master and make main the default branch, which is what we are talking about doing since nobody wants to maintain two separate plugins (especially queries, and freezing master will make it unusable as soon as the next breaking parser update happens in nvim-treesitter).

TL;DR: We can and should do this, but it's absolutely not a "lol"ing matter.

ofseed commented 2 months ago

Yes, that might be a joke that isn't funny, queries could not be synced between two branches after that.

But if we are certain about making this change, we could first remove #make-range in the master branch. This will likely only require changes to the query, because after this, they will become TSNode, and the code on the master branch will still be able to handle them.

clason commented 2 months ago

But if we are certain about making this change, we could first remove #make-range in the master branch. This will likely only require changes to the query, because after this, they will become TSNode, and the code on the master branch will still be able to handle them.

That would also get quick feedback on outside use of #make-range!. But I'm also OK with making the switch early, but that requires adding some mapping helpers to reduce the unacceptable setup boilerplate on main (which is the only thing missing before we could declare it as an "early access" replacement for master).

However, one point I want to discuss is whether, if #make-range! is removed, we can still transition from Range to TSNode while preserving all existing features. The current discussion leads me to believe that this is possible.

Yes, that is the main design decision we need to make. And I agree that given the current information, this looks possible and desirable. Personally, I think #make-range! has value but is so intrusive, it should be handled upstream (ideally tree-sitter); if textobjects don't need it, we should not provide it.

ofseed commented 1 month ago

@clason Could you please take a look at e83e3ebed7c82d01ab97f612d3cab2c26e5032bf? Previously, we were using #make-range!, so the corresponding range didn’t have a capture, I think this is why we used iter_matches. However, https://github.com/nvim-treesitter/nvim-treesitter-textobjects/pull/612 removed #make-range!, switching to iter_captures should be safe since now a capture, which is a TSNode, should correspond to the text object we need. I haven’t maintained the queries in this repository, so I’m not 100% sure.

clason commented 1 month ago

Looks good to me; if this is incorrect, it should fail pretty badly (easy to manually test with one of the quantified captures here).

@ribru17 ?

ofseed commented 1 month ago

Oh, I realized there is indeed an issue with this change regarding quantified captures. My previous manual tests were too simple, sorry about that.

Quantified captures don’t combine multiple nodes into a single node; instead, they match these nodes separately while keeping them within the same pattern. Therefore, we need to use iter_matches to handle them.