Closed Princesseuh closed 3 months ago
Latest commit: 2b2a8da47ee49ffbb4cd6bfd688a4ec36b6a394b
The changes in this PR will be included in the next version bump.
Yeah, I was surprised not to find much of a perf difference. To be clear, there is one, but I could only measure real differences in files with 10k+ characters, multiple script and style tags late in the file, and an obscene amount of emojis.
I'll take a quick look to see if there's a way to reuse the line offset table from the sourcemapping logic, which would speed things up a bunch. Otherwise I'm not too bothered: this still ends up being faster for the language server, because the logic it previously used to get script and style tags was expensive.
The power of Go and native binaries 😄
Refactored to use the sourcemapping line offsets instead, and it's much faster! Sourcemapping is still the ultimate bottleneck, though.
Changes
Previously, when extracting script and style tags, we tried to somewhat make it work with multibyte characters by counting them and skipping them. Not only was this cumbersome, it also didn't really work, because our loop would run multiple times over roughly the same characters. Either way, it was annoying.
In this PR, I changed it so that we use the line offsets table from the sourcemapping logic to get the offsets. This is somewhat slower, especially in some extreme cases, but in most cases there's no difference, and at least it's now correct.
I also updated the frontmatter and body range extraction to use this method, as they suffered from the same problem.
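For readers who want a picture of the approach, here is a rough sketch of how a line offset table can turn a UTF-8 byte offset into a UTF-16 offset. It is only an illustration under assumptions, not the compiler's actual code: `lineStart`, `buildLineTable`, and `toUTF16` are hypothetical names, and the real sourcemapping table may well store different fields.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// lineStart records where a line begins, in both units.
// Hypothetical type, used only for this sketch.
type lineStart struct {
	byteOffset  int // offset in UTF-8 bytes (what Go string indexing uses)
	utf16Offset int // offset in UTF-16 code units (what JS string indexing uses)
}

// utf16Len returns how many UTF-16 code units a rune occupies.
func utf16Len(r rune) int {
	if r >= 0x10000 {
		return 2 // outside the BMP: encoded as a surrogate pair
	}
	return 1
}

// buildLineTable scans the source once and records the start of every line.
func buildLineTable(src string) []lineStart {
	table := []lineStart{{0, 0}}
	u16 := 0
	for i, r := range src {
		u16 += utf16Len(r)
		if r == '\n' {
			table = append(table, lineStart{i + 1, u16})
		}
	}
	return table
}

// toUTF16 converts a byte offset (on a rune boundary) into a UTF-16 offset.
// It binary-searches for the containing line, then walks only that line.
func toUTF16(src string, table []lineStart, byteOffset int) int {
	// Index of the last line that starts at or before byteOffset.
	line := sort.Search(len(table), func(i int) bool {
		return table[i].byteOffset > byteOffset
	}) - 1
	u16 := table[line].utf16Offset
	for _, r := range src[table[line].byteOffset:byteOffset] {
		u16 += utf16Len(r)
	}
	return u16
}

func main() {
	src := "<p>😄</p>\n<script>let a = 1</script>\n"
	table := buildLineTable(src)
	b := strings.Index(src, "<script>")
	fmt.Printf("byte offset %d, UTF-16 offset %d\n", b, toUTF16(src, table, b))
	// prints: byte offset 12, UTF-16 offset 10
}
```

The table is built in a single pass, so each lookup only has to walk the characters of one line instead of rescanning the file from the start.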
In theory, this is a breaking change, but in practice the numbers it used to spit out were unusable in JS unless you did a lot of conversion yourself. Now the numbers can be used as-is. Also, I doubt anyone other than me is using them...
Fixes https://github.com/withastro/language-tools/issues/921
Testing
Tests should pass. I updated some and added more.
Docs
N/A, though I did add a JSDoc comment on the type to say that it's UTF-16 based.