errata-ai / vale

:pencil: A markup-aware linter for prose built with speed and extensibility in mind.
https://vale.sh
MIT License

Add a way to combine dictionary processing across packages #628

Closed (jtiee closed this issue 10 months ago)

jtiee commented 1 year ago

Describe the feature

I've started using Vale packages. Since a package can contain a Spelling.yml style, providing the Hunspell dictionaries alongside the style files in the package would be convenient.

When multiple packages define spelling styles, each style's spellchecks are evaluated independently. If a word like xyzzy appears in style B's dictionary but not in style A's, style A still flags the word as misspelled.

It would be great if all the dictionaries defined by all active styles could be processed together.
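To make the problem concrete, here is a minimal Python sketch of independent evaluation. It is not Vale's implementation: the style names, the word sets, and the `check_independently` helper are all hypothetical, and real Hunspell dictionaries also carry affix rules that this ignores.

```python
# Hypothetical word sets standing in for each style's Hunspell dictionary.
STYLE_DICTIONARIES = {
    "StyleA": {"vale", "linter"},
    "StyleB": {"vale", "linter", "xyzzy"},
}


def check_independently(words, dictionaries):
    """Check every word against each style's dictionary on its own.

    Returns (style, word) pairs, one per style that does not know the word.
    """
    return [
        (style, word)
        for style, known in sorted(dictionaries.items())
        for word in words
        if word not in known
    ]


# "xyzzy" is known to StyleB, yet StyleA still reports it.
reports = check_independently(["vale", "xyzzy"], STYLE_DICTIONARIES)
print(reports)
```

In this sketch the only report comes from StyleA, even though another active style considers the word valid, which is the false positive described above.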

Related, it would be great if dicpath was relative to the Spelling.yml file or to the StylesPath instead of relative to where vale was executed. I recognize that this could be a breaking change, so perhaps a dictpath key could be introduced to handle the style-relative path, and using dicpath and dictpath together would be considered an error.

A style-relative dictpath would make it much easier to provide a style package that included its dictionaries and was agnostic to where vale is executed.

jdkato commented 1 year ago

> Related, it would be great if dicpath was relative to the Spelling.yml file or to the StylesPath instead of relative to where vale was executed.

This should already be the case. There are a few different places that Vale searches:

https://github.com/errata-ai/vale/blob/c0107e7e6f8326dc90b29320552fc95dd037e837/internal/check/spelling.go#L199-L209
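The linked source is not reproduced here. As a rough illustration of the general idea of trying several candidate locations in turn, a Python sketch might look like the following. The function name `resolve_dicpath`, its parameters, and the order of the candidates are assumptions made for this example, not a description of what spelling.go actually does.

```python
import tempfile
from pathlib import Path


def resolve_dicpath(dicpath, rule_file, styles_path, cwd):
    """Return the first existing directory among several candidate bases.

    The candidate order below is an assumption for illustration only.
    """
    candidates = [
        Path(rule_file).parent / dicpath,  # next to the rule file
        Path(styles_path) / dicpath,       # under the styles directory
        Path(cwd) / dicpath,               # relative to the working directory
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None


# Build a throwaway layout with the dictionaries stored beside the rule.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    styles = root / "styles"
    expected = styles / "MyStyle" / "dicts"
    expected.mkdir(parents=True)
    rule = styles / "MyStyle" / "Spelling.yml"  # need not exist for the lookup
    found = resolve_dicpath("dicts", rule, styles, root)
    missing = resolve_dicpath("no-such-dir", rule, styles, root)

print(found == expected, missing)
```

With a lookup like this, a package that ships its dictionaries beside its rule file would resolve them no matter where the command is run from.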

jdkato commented 1 year ago

> It would be great if all the dictionaries defined by all active styles could be processed together.

What would be your expectation for the case where a word doesn't appear in any of the dictionaries? In other words, there will be multiple rules (Style1.Rule, Style2.Rule, ...) reporting the same error.

jtiee commented 1 year ago

In that scenario, it would be acceptable for every style that provides dictionaries to report an error. That's vastly superior to the current situation where a word defined in one style's dictionary but not in another style's dictionary causes an error.

Ideally, Vale would recognize that the error repeats across styles and report a single error.

It might be helpful for the error to list all of the styles providing dictionaries so that the user can decide which dictionary to update. I recognize that this would depart from the way Vale currently reports errors, so I leave that up to you.

jdkato commented 1 year ago

I've created a proposal (#688) for how I'd like to solve this: essentially, packages could contribute dictionaries to be consumed by Vale.Spelling instead of distinct spelling-based rules.

This handles both cases:

  1. An error will only be reported for words that aren't valid according to any dictionary.
  2. If an error is found, it will only be reported once (by Vale.Spelling).
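The two properties above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this comment, not the design in #688 or Vale's code: the package names, the word sets, and the `vale_spelling_alerts` function are hypothetical.

```python
def vale_spelling_alerts(words, contributed):
    """Check words against the union of all contributed dictionaries.

    `contributed` maps a hypothetical package name to its set of known words.
    Every alert is attributed to the single rule name "Vale.Spelling".
    """
    known = set().union(*contributed.values())
    return [("Vale.Spelling", word) for word in words if word not in known]


contributed = {
    "PackageA": {"vale", "linter"},
    "PackageB": {"vale", "xyzzy"},
}
alerts = vale_spelling_alerts(["vale", "xyzzy", "qwertz"], contributed)
print(alerts)
```

Here "xyzzy" passes because one contributed dictionary knows it, and "qwertz" produces exactly one alert rather than one per package.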

Feel free to let me know what you think.

jtiee commented 1 year ago

I like the proposal and agree that it should solve both problems.

jdkato commented 10 months ago

This is out now.