RubixDev / syntastica

Modern and easy syntax highlighting using tree-sitter
https://rubixdev.github.io/syntastica/
GNU General Public License v3.0
16 stars 3 forks source link

How to to use `process` with a theme in latest version? #3

Closed diocletiann closed 1 year ago

diocletiann commented 1 year ago

Hi, thank you for this crate. It has made implementing tree-sitter syntax highlighting much simpler in my prototype editor. I updated to the latest version from Github so I can use the new themes and the new process_once no longer takes a theme as an argument. Is there a new way to generate Highlights with the theme colors included without rendering? Thanks.

RubixDev commented 1 year ago

Hi, I am glad you like the crate! Currently it is a good idea to use the latest git version, as I don't regularly update the crates.io crate yet. Just know that below a 1.0 release, the public API may change without notice.

Yes, the new process_once function does not take a theme anymore, because processing does not depend on the theme anymore. The resulting Highlights can now be rendered with different themes directly. This improves performance for example in the themes example.

I do not really see a good reason, why you would need to store Highlights with theme-specific styling information attached, but if you have one, I could add a new type ThemedHighlights (or similar) along with a different render function.

Also, as you say you are creating an editor, I don't think process_once is the function you should use. Instead, I would save a Processor instance between calls (see the three main use-cases of syntastica).

I hope this helps, and thanks again for using the crate!

diocletiann commented 1 year ago

Thanks for the quick reply. I used the process_once function because it gave me a Highlights Vec with text and color info that could be easily mapped to wgu_glyph rendering calls like the code below. If there's a new way to apply a theme (Melange FTW) and get the same data I'd be happy to try it.

for (text, style) in &buffer.highlights[row] {
   let text_color = match style {
      Some(style) => [
         style.color().red() as f32 / 255.0,
         style.color().green() as f32 / 255.0,
         style.color().blue() as f32 / 255.0,
         1.0,
      ],
      None => [1.0, 1.0, 1.0, 1.0],
   };

   glyph_brush.queue(Section {
       screen_position: (
           text_start,
           current_row_position - viewport.vert_offset,
       ),
       bounds: ((viewport.width * 2) as f32, buffer.line_height),
       text: vec![Text::new(text)
           .with_color(text_color)
           .with_scale(buffer.font_scale)
           .with_font_id(FontId(0))],
       ..Section::default()
   });
RubixDev commented 1 year ago

Ah I never thought about it like that, the Renderers are kinda always meant to be used for rendering things, but yea in your case it doesn't really work because the output isn't a String 🫤. Do you think it may be possible to achieve what you currently have if the Renderer trait could return any custom type, not just String? Depending on the context you have to store, you could save those things as struct fields on a new struct which then implements Renderer with an associative type of sth like type Output = Section (the methods on Renderer give you mutable access to self).

I haven't been near any editor or even GUI app developing at all, so I don't really have any experience with user interfaces outside terminals. Feedback from your side is therefore welcome :smile:

diocletiann commented 1 year ago

I found a way by looking up the theme key. Also, in both versions I needed to convert the &strs in Highlights to CompactString at the end of process(). Maybe there's another way around the Highlights lifetime issue.

The processing time is 10x faster than the old version, 5ms vs 50ms, faster than a 120hz frame and there is no more lag when typing.

I'm using Processor::try_from_provider, now sure if that is ideal. How can I get an instance of ConfiguredLanguages to use Processor::new? Thanks!

buffer.highlights = processor
    .process(rope_str, "rust")
    .unwrap_or_else(|err| panic!("parsing failed: {err}"))
    .into_iter()
    .map(|token| {
        token
            .into_iter()
            .map(|(text, key)| (text.to_compact_string(), key))
            .collect::<Vec<_>>()
    })
    .collect::<Vec<_>>();
    for row in viewport.first_row..=viewport.last_row {
        if let Some(_line) = buffer.rope.get_line(row) {
            let mut text_start = buffer.gutter_width;

            for (text, key) in &buffer.highlights[row] {
                let style = match key {
                    Some(key) => theme.get(*key),
                    None => None,
                };

                let text_color = match style {
                    Some(style) => [
                        style.color().red as f32 / 255.0,
                        style.color().green as f32 / 255.0,
                        style.color().blue as f32 / 255.0,
                        1.0,
                    ],
                    None => [1.0, 1.0, 1.0, 1.0], // default color if no style
                };

                glyph_brush.queue(Section {
                    screen_position: (
                        text_start,
                        current_row_position - viewport.vert_offset,
                    ),
                    bounds: ((viewport.width * 2) as f32, buffer.line_height),
                    text: vec![Text::new(text)
                        .with_color(text_color)
                        .with_scale(buffer.font_scale)
                        .with_font_id(FontId(0))],
                    ..Section::default()
                });

                text_start += text.len() as f32 * buffer.cell_width;
            }
RubixDev commented 1 year ago

Maybe there's another way around the Highlights lifetime issue.

I don't think there is a better way than having owned data :man_shrugging:

The processing time is 10x faster than the old version, 5ms vs 50ms, faster than a 120hz frame and there is no more lag when typing.

That's good to hear, performance is currently what I am most disappointed about, but I'm glad I was already able to get it down to something usable :+1:

I'm using Processor::try_from_provider, now sure if that is ideal. How can I get an instance of ConfiguredLanguages to use Processor::new?

Oops, seems like I still have some local changes... Processor::try_from_provider is the preferred option and will become Processor::new after I commit my local changes. I changed the structure of the LanguageProvider trait to allow for slightly better performance and then there will be no ConfiguredLanguages struct anymore. Currently you could create one by calling get_languages on your LanguageProviderImpl, but that shouldn't be required.


I can also see the problem you were having now. Looking up the theme key (as you now do) every time you need to render style text is one way (and I think currently the only way) to achieve this, but I think I will add a new Highlights type with baked-in theme information as I mentioned in my first comment. This will then allow you to use that type the same way you used Highlights before, and requires less theme map lookups.

diocletiann commented 1 year ago

I can also see the problem you were having now. Looking up the theme key (as you now do) every time you need to render style text is one way (and I think currently the only way) to achieve this, but I think I will add a new Highlights type with baked-in theme information as I mentioned in my first comment. This will then allow you to use that type the same way you used Highlights before, and requires less theme map lookups.

That would be great, thank you! And if there was a way to reuse your text &'src strs when rendering without generating thousands of owned strings on every parse, that would be ideal. I tried a few workarounds but no dice so far. I'll separate the owned string creation and measure how much time that adds on a large buffer.

I also ran into an issue where theme.get() was getting a lot of None and had to improvise. I didn't see the fallback color in the older process_once version.

    let default_style = theme.get("text").unwrap();

    for row in viewport.first_row..=viewport.last_row {
        if let Some(_line) = buffer.rope.get_line(row) {
            let mut text_start = buffer.gutter_width;

            // Draw highlighted text
            if !buffer.highlights.is_empty() {
                for (text, key) in &buffer.highlights[row] {
                    let style = match key {
                        Some(k) => match theme.get(*k) {
                            Some(s) => s,
                            None => default_style,
                        },
                        None => default_style,
                    };

                    let text_color = [
                        style.color().red as f32 / 255.0,
                        style.color().green as f32 / 255.0,
                        style.color().blue as f32 / 255.0,
                        1.0,
                    ];
RubixDev commented 1 year ago

And if there was a way to reuse your text &'src strs when rendering without generating thousands of owned strings on every parse, that would be ideal.

I mean as along as you can keep the source text alive, the references can be reused. I just don't know of any good way to save both a String and references to that string in the same struct. It's one of those few situations where I still have issues with satisfying the borrow checker. If you can think of anything I could change on my side though, I'm always open for suggestions.

I also ran into an issue where theme.get() was getting a lot of None and had to improvise. I didn't see the fallback color in the older process_once version.

Huh, interesting. This might be because in the old version, all keys the theme provided were valid captures, whereas now only keys in THEME_KEYS will ever get highlighted. What theme were you using before, and could you maybe list a few keys which now return None but didn't before? I may just need to extend that THEME_KEYS list.

RubixDev commented 1 year ago

Also a small side-note: couldn't this

let style = match key {
    Some(k) => match theme.get(*k) {
        Some(s) => s,
        None => default_style,
    },
    None => default_style,
};

be rewritten as

let style = key.and_then(|k| theme.get(*k)).unwrap_or(default_style);

or something similar?

diocletiann commented 1 year ago

Huh, interesting. This might be because in the old version, all keys the theme provided were valid captures, whereas now only keys in THEME_KEYS will ever get highlighted. What theme were you using before, and could you maybe list a few keys which now return None but didn't before? I may just need to extend that THEME_KEYS list.

When using Gruvbox Dark, these keys return none:

"punctuation.bracket"
"punctuation.delimiter"
"keyword.coroutine"

some from Melange:

"function.call"
"keyword.coroutine"
"keyword.operator"
"keyword.return"
"punctuation.bracket"
"punctuation.special"
"text" // tried using this as the fallback
"type.builtin"
RubixDev commented 1 year ago

Ah, that is because gruvbox only defines keyword but not keyword.coroutine. Internally I already had a function to accommodate for that, but with commit 9b62f2cdc20438798f7c264e09ac387c88160d08 it is now a public method on ResolvedTheme called find_style, which you can simply use instead of theme.get().

As for text in Melange, that theme simply does not define a style for text, just for text.literal and a few others.

https://github.com/RubixDev/syntastica/blob/7db7261d271ab4bb734d05066f5f06fd93d1396b/syntastica-themes/src/melange.rs#L55-L60

RubixDev commented 1 year ago

Okay and with commit f49d65541e87663368882741826bd13d0af01a21 there now is a ThemedHighlights type which you can create from Highlights using the resolve_styles function in the renderer module. I'll leave this issue open until you can confirm whether this works for you.

diocletiann commented 1 year ago

Thanks! I just updated the crate and I'm getting compile_error!("current feature set includes no parsers");.

Processor created like this:

let language_set = LanguageSetImpl::new();
let processor = Processor::new(&language_set);
RubixDev commented 1 year ago

Thanks! I just updated the crate and I'm getting compile_error!("current feature set includes no parsers");.

That error occurs when you use a parser collection without specifying the group to use (enable one of some, most, and all). That should've already been the case before though, so unless you accidentally removed the feature flag while updating, I don't really know what could cause this. Which parser collection are you using?

diocletiann commented 1 year ago

My mistake, I messed up Cargo.toml when I changed sources.

RubixDev commented 1 year ago

Good, so is everything working now?

diocletiann commented 1 year ago

Sorry, not quite. Having lifetime issues. I was able to put the old language provider in a OnceCell but this one doesn't like it.

error[E0597]: `language_set` does not live long enough
   --> src/lib.rs:152:40
    |
152 |     let mut processor = Processor::new(&language_set);
    |                         ---------------^^^^^^^^^^^^^-
    |                         |              |
    |                         |              borrowed value does not live long enough
    |                         argument requires that `language_set` is borrowed for `'static`
...
434 | }
    | - `language_set` dropped here while still borrowed
RubixDev commented 1 year ago

Huh, the lifetime requirements should've stayed the same if you were previously using try_from_provider. I assume you mean once_cell's Lazy type? If so, you may need to store the language set in a separate Lazy cell first and then reference that. I can't really tell much without seeing the code you are using though.

diocletiann commented 1 year ago

I was using std::sync::OnceLock but here is once_cell::sync::Lazy:

error[E0277]: `UnsafeCell<[Option<HighlightConfiguration>; 9]>` cannot be shared between threads safely
  --> src/lib.rs:48:23
   |
48 | static LANGUAGE_SET2: Lazy<LanguageSetImpl> = Lazy::new(|| {
   |                       ^^^^^^^^^^^^^^^^^^^^^ `UnsafeCell<[Option<HighlightConfiguration>; 9]>` cannot be shared between threads safely
   |
   = help: within `LanguageSetImpl`, the trait `Sync` is not implemented for `UnsafeCell<[Option<HighlightConfiguration>; 9]>`
   = note: required because it appears within the type `LanguageSetImpl`
   = note: required for `once_cell::imp::OnceCell<LanguageSetImpl>` to implement `Sync`
   = note: required because it appears within the type `OnceCell<LanguageSetImpl>`
   = note: required for `once_cell::sync::Lazy<LanguageSetImpl>` to implement `Sync`
   = note: shared static variables must have a type that implements `Sync`
RubixDev commented 1 year ago

That's because LanguageSetImpl does not implement Send and Sync, you could wrap it in a Mutex or similar though

diocletiann commented 1 year ago

What would that look like? I tried a couple of version but I'm still learning Rust.

diocletiann commented 1 year ago

This lets me pass a reference to the processor to other functions in the winit event loop:

let language_set = Box::new(LanguageSetImpl::new());
let language_set_ref = Box::leak(language_set);
let mut processor = Processor::new(language_set_ref);
RubixDev commented 1 year ago

This lets me pass a reference to the processor to other functions in the winit event loop:

let language_set = Box::new(LanguageSetImpl::new());
let language_set_ref = Box::leak(language_set);
let mut processor = Processor::new(language_set_ref);

Yes, this way you leak the memory used by the LanguageSetImpl and never clean it up, but if you want the processor to be available until the end of program execution anyways, I don't think there really is a reason to not do it this way.


What would that look like? I tried a couple of version but I'm still learning Rust.

Something like this:

static LANGUAGE_SET: Lazy<Mutex<LanguageSetImpl>> = Lazy::new(|| Mutex::new(LanguageSetImpl::new()));

But unless multiple threads need to access this variable, it would probably be cleaner and faster to store it somewhere non-global, like in a struct field where all struct methods can access it.

diocletiann commented 1 year ago

Lazy<Mutex<LanguageSetImpl>> and Lazy<Arc<Mutex<LanguageSetImpl>>> both run into lifetime issues.

ThemedHighlights works great, thanks again. I think I'll stick with the leak for now since the processor instance stays alive with the program, at least until this editor is more usable. Then I can try to come up with a struct that ensures the correct lifetime. Good luck with the crate!

RubixDev commented 1 year ago

Okay 👍. Thanks and good luck with your editor!