Closed iilyak closed 1 year ago
The line numbers reported where pointing to the next line instead of the line which was added into matches using matches = append(matches, {begin: begin, end: end}). I had a theory that it could be caused by a presence of a front-mater.
This problem is also caused by the presence of Chinese characters in the frontmatter. Once I removed the Chinese from frontmatter the vale reports correct line numbers.
By looking at a code (in core/location.go, check/script.go) I don't see a place where loc
reported from a script is adjusted to account for modifications to the text done inside the script. Which means if script modifies the text (by removing sections or codeblocks or anything else) the script itself is responsible for maintaining the correlation of line numbers in original text with modified text. Am I correct here?
Since line numbers are not working and correct matches cause a crash (and I need to validate a big corpus of texts) I was trying to disable the match by providing bogus positions.
matches = append(matches, {begin: 0, end: 0})
I also tried
matches = append(matches, {begin: 1, end: 1})
matches = append(matches, {begin: 1, end: 2})
The result is the same. I observer a crash like the following every time Chinese characters are present.
panic: runtime error: slice bounds out of range [:4709] with capacity 4096
goroutine 7 [running]:
github.com/errata-ai/vale/v2/internal/check.re2Loc(...)
github.com/errata-ai/vale/v2/internal/check/definition.go:129
github.com/errata-ai/vale/v2/internal/check.makeAlert({{{0x0, 0x0}, {0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x14002f6d7b8, 0x6}, {0x1010e8966, ...}, ...}, ...)
github.com/errata-ai/vale/v2/internal/check/definition.go:133 +0x19c
github.com/errata-ai/vale/v2/internal/check.Script.Run({{{{0x0, 0x0}, {0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x14002f6d7b8, 0x6}, {0x1010e8966, ...}, ...}, ...}, ...)
github.com/errata-ai/vale/v2/internal/check/script.go:61 +0x538
github.com/errata-ai/vale/v2/internal/lint.(*Linter).lintBlock(0x14003be5768, 0x14003fd0000, {{0x14005a10000, 0x1bef}, 0xffffffffffffffff, {0x140057ea6e8, 0x7}, {0x14005a10000, 0x1bef}}, 0x100e13bb8?, ...)
github.com/errata-ai/vale/v2/internal/lint/lint.go:229 +0x174
github.com/errata-ai/vale/v2/internal/lint.Linter.lintSizedScopes({{0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}, 0x14002da9240, 0x14003016d50, 0x101912880, 0x1, ...}, ...)
github.com/errata-ai/vale/v2/internal/lint/ast.go:193 +0x30c
github.com/errata-ai/vale/v2/internal/lint.Linter.lintHTMLTokens({{0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}, 0x14002da9240, 0x14003016d50, 0x101912880, 0x1, ...}, ...)
github.com/errata-ai/vale/v2/internal/lint/ast.go:147 +0x7e0
github.com/errata-ai/vale/v2/internal/lint.Linter.lintMarkdown({{0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}, 0x14002da9240, 0x14003016d50, 0x101912880, 0x1, ...}, ...)
github.com/errata-ai/vale/v2/internal/lint/md.go:60 +0x1c0
github.com/errata-ai/vale/v2/internal/lint.(*Linter).lintFile(0x1400018e140, {0x14002a82360?, 0x0?})
github.com/errata-ai/vale/v2/internal/lint/lint.go:169 +0x26c
github.com/errata-ai/vale/v2/internal/lint.(*Linter).lintFiles.func1.1.1({0x14002a82360?, 0x0?})
github.com/errata-ai/vale/v2/internal/lint/lint.go:114 +0x4c
created by github.com/errata-ai/vale/v2/internal/lint.(*Linter).lintFiles.func1.1
github.com/errata-ai/vale/v2/internal/lint/lint.go:112 +0x1e8
By looking at a code (in core/location.go, check/script.go) I don't see a place where
loc
reported from a script is adjusted to account for modifications to the text done inside the script. Which means if script modifies the text (by removing sections or codeblocks or anything else) the script itself is responsible for maintaining the correlation of line numbers in original text with modified text. Am I correct here?
No, the script just has to report locations accurate to itself. In other words, script_text[start:end]
needs to match what you intend the alert to be about.
Vale handles the location adjustments internally.
Context
I wanted to write a script extension which makes sure the order of items in the numbered list (in the markdown document) is in the correct order. I've noticed odd behavior. The line numbers reported where pointing to the next line instead of the line which was added into
matches
usingmatches = append(matches, {begin: begin, end: end})
. I had a theory that it could be caused by a presence of a front-mater. So I've decided to replicate the script from vale's own test suite https://github.com/errata-ai/vale/blob/5c5c0487bc4160db875f2601b4e133bab936594d/testdata/styles/Scripts/Test.yml and use it to repro the case.Symptom
When I tried to reproduce I witness a crash. The crash didn't happen when I remove Chinese characters.
The document
Stacktrace