Determine comment tokens from injection layers

helix-editor / helix

A post-modern modal text editor.

https://helix-editor.com

Mozilla Public License 2.0

33.13k stars 2.46k forks source link

Determine comment tokens from injection layers #7364

Open the-mikedavis opened 1 year ago

the-mikedavis commented 1 year ago

toggle_comments should take language injections into account. For example in HTML:

<p>
  C-c on this line should use the HTML comment token(s).
</p>
<script type="text/javascript">
  // C-c on this line should use the javascript comment token(s).
  foo();
</script>

To do this we should look at the byte range for each Range in the current Selection and find the LanguageLayer in doc.syntax() (Syntax) with the smallest/deepest range that completely covers the selection range. We will then need a way to find the LanguageConfiguration for a given LanguageLayer and then we can use that LanguageConfiguration's comment_token for that range in the selection.

The changes for this will conflict with #1505 / #4718 so this feature may be better off waiting for those changes to land.

gabydd commented 1 year ago

Thanks for making this issue especially the implementation notes, I meant to make it this morning but got caught up with work. This does conflict a tiny bit with my pr but if someone does implement this it shouldn't be too hard to rebase.

I might also just pick this up after #4718

pascalkuthe commented 1 year ago

One complication will be that we have a language for comments that is injected into comments. We don't want to change the kind of comment tokens inside that obviously.

I think we want to do something similar for indentation (see https://github.com/helix-editor/helix/discussions/7381?notification_referrer_id=NT_kwDOA6_EWrM2ODA2NzgxOTM3OjYxODUwNzE0#discussioncomment-6230647) so having shared infrastructure for detecting language inside injections would be useful

the-mikedavis commented 1 year ago

For indentation and textobjects I think we will want to switch to running queries across injections using something like: https://github.com/the-mikedavis/helix/blob/8483de78624be0b52dcd4144bc54a5b918d369cd/helix-core/src/syntax.rs#L1271-L1309 (currently that lives on #2857 but could be ported out). That QueryIter gives the LayerId of the current capture, so maybe the function in Syntax that gets a LayerConfiguration by byte range should actually be two functions: find the LayerId for a given byte range and then find a LanguageConfiguration for a LayerId.

pascalkuthe commented 1 year ago

Note that I didn't mean indent queries (altough those should also be run across injections) but determining what kind of indent to use (tab, single space, four spaces,...). We could use th layerid from the query for autoindent but not for manual indent with > and < as we don't run a query there afaik

Splitting the functions in two sounds like a good idea tough 👍

the-mikedavis commented 8 months ago

I was looking at this to see if it overlapped with #9320 at all. They look like separate problems unless we want to do a large refactor of how the syntax.rs types are structured (specifically HighlightConfiguration, LanguageConfiguration, Loader and LanguageLayer). I pushed up a branch with a partial solution to this: https://github.com/helix-editor/helix/compare/17dd102e5cccbb2a9a0f0224af63e52f3dab846b...get-lang-config-by-injection-layer

We need to store some data on LanguageLayer so we can ask Syntax what the LanguageConfiguration is for a given injection layer. And we can use a version of https://github.com/helix-editor/helix/pull/5176#discussion_r1448073448 acting on the Syntax layers field to get the LayerId for a given byte range.

noor-tg commented 1 week ago

so any thing new about this ? or is there any experimental branch to test ?