Closed erbridge closed 4 years ago
Merging #260 into master will not change coverage. The diff coverage is
100%
.
@@ Coverage Diff @@
## master #260 +/- ##
=====================================
Coverage 100% 100%
=====================================
Files 3 3
Lines 85 80 -5
=====================================
- Hits 85 80 -5
Impacted Files | Coverage Δ | |
---|---|---|
filter.js | 100% <100%> (ø) |
:arrow_up: |
index.js | 100% <100%> (ø) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 727b1f9...10eba86. Read the comment docs.
Thanks for working on this, really well done! ✨
This may be a good addition, but I am wondering how many people it’ll benefit.
As you’re integrating alex into a new thing (instead of using the CLI), have you thought about using retext-equality
/ retext-profanities
directly?
Because alex is focussing a bit more on ease of use from the CLI, whereas with unified / remark / retext / those plugins you’ll have more control already!
I did consider rolling our own, but thought I'd propose this change and see if it made sense for the library as a whole. I do think it's useful. Alex is a wrapper that makes running these checks as simple as making a single API call, but retext-equality
and retext-profanities
skew heavily towards North American English usage. We decided it would be more reliable to explicitly choose which rules to implement, so as to reduce the surface of false positives that in themselves raise some of the more problematic language usage. Adding a deny list option is a small code change, which makes it much simpler for others to do the same, at little cost IMO.
If you do decide that this change is beyond the scope of Alex, though, I'll happily pull out the test changes and refactor into a new PR (if that's valuable), and roll our own implementation. Let me know.
Fair points, one extra question! When you say:
but retext-equality and retext-profanities skew heavily towards North American English usage
Could you elaborate on that? Because those two are the essence of alex, they’re where the warnings come from. So if they are North American focussed, then so is alex?
(Note I’d like to add more languages, and we have support for that in retext-profanities, but not yet in retext-equality, which we need more people for)
retext-equality
, in particular, has rules that would never come up in their problematic sense for British English speakers, but the fact that the rules would flag them effectively makes them problematic. There are also a number of rules that might make sense as quiet warnings that an individual might ignore when running on prose they're writing on their own, but would be disruptive when running on conversation or PRs (another use case we were looking in to) via a bot. An example of both is invalid
, which comes up a bunch for a tech company, and as far as a I know, is not problematic when used in the sense of validation rules.
Using a deny list would get around those issues by allowing us to pick and choose the rules that we think are valuable. It's easier to choose what to check than to ensure we catch all of the ones we don't want to check.
Adding this functionality to Alex means we get the benefit of a single API call with a familiar, well documented interface, and reduce the maintenance burden on us. If retext-equality
gains new features or has API changes, only Alex would need updating, and we'd get the benefits for free.
Hey those are great points, thanks for explaining!
The main question for me is:
I see alex as something that people can use to check their own writing. Because people are smarter than alex, but make mistakes, and alex is there to remind them. I’m not sure alex is good for something like Slack, as you found out with the deny list, and maybe that’s a feature instead of a bug? The question for me thus is: “is it alex, if it is only a couple of rules?”, and “is it alex, if its for generic conversation?”
And to respond to the rest:
Using a deny list would get around those issues by allowing us to pick and choose the rules that we think are valuable. It's easier to choose what to check than to ensure we catch all of the ones we don't want to check.
Yeah, I agree! When using the plugins directly, mixing and matching, you can do this already!
Adding this functionality to Alex means we get the benefit of a single API call with a familiar, well documented interface, and reduce the maintenance burden on us. If retext-equality gains new features or has API changes, only Alex would need updating, and we'd get the benefits for free.
I would argue that those two plugins and the rest of the ecosystem are better documented than alex 😅 Having maintained the unified ecosystem and alex on top for the last 5 years, and most probably for the next 5, without any promises, both will probably not change a lot. With the unified, you can do many more powerful things (such as the deny list, but also custom checks, spelling, so much more!)
Those are all good points.
I'm gathering that you'd rather not accept this feature, which is fine. I'll work on rolling our own instead of using Alex directly.
Would you be interested in the other commits? Those that add to the test suite and perform some refactoring?
I’m not really sure yet, the code is good and technically it’s a good change, but I’m wondering about the implications. I’d like to think about this a bit (and do welcome others’ opinions!), I hope that’s okay and that you can progress regardless!
That's totally fine. Let me know either way.
In the meantime, I ended up rolling another library with a similar goal to Alex that supports this behaviour, along with presets. In case you're interested.
Alright, sorry for the wait. I thought about this some more, and I think including your work is important. I also missed this initially but I love the ideas of presets!
I’d like to land this, and perhaps also include stuff like presets here? Are you still interested in working on this PR? (there are merge conflicts). And would you perhaps also be interested in working on getting presets into this project?
I'm pretty time poor right now, but I do think there's value in this. I'll try to take a look soon.
That’d be awesome! Let me know if there are parts you’d want me to take care of: where I can help!
I've done the rebase of this branch. I don't know when I'll have time to port in the preset stuff from Reilly, though.
Fixed the tests if you're able to take another look @wooorm?
Awesome, thanks @erbridge!! ✨✨✨
Sometimes it can be useful to be able to chose which rules to include, rather than exclude. For instance, while building a Slack bot to run Alex on our conversations, we found that we got far too many false positives, and wanted to be able to specify which rules to use. This feature supports that without any changes to current behaviour.
If both
allow
anddeny
are provided, we throw an error. I tried to follow the pattern of errors thrown elsewhere, but let me know if there's a more "Alex" way of handling conflicting configuration.There is a small refactor to unify applying the filter in
core()
, that will make it easier to add features like this in future.In order to be confident in these changes, I have also make the test suite more thorough, including adding more configuration combination testing.