speced / respec

A tool for creating technical documents and web standards
https://respec.org/

Linter rule to check for non-en-US English in IDL #2078

Closed marcoscaceres closed 5 years ago

marcoscaceres commented 5 years ago

See this Twitter thread: https://twitter.com/marcosc/status/1093151940769341440?s=20

In W3C docs/specs, we should check that WebIDL attribute and operation identifiers are in US English.

I wonder if there is some way of doing it? Need to research.

marcoscaceres commented 5 years ago

Maybe https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/i18n/detectLanguage

CodHeK commented 5 years ago

@marcoscaceres can I try this?

saschanaz commented 5 years ago

@marcoscaceres That's a WebExtensions API but ReSpec is not an extension 😅

marcoscaceres commented 5 years ago

Oh whoops... totally missed that. I was wondering how I'd missed that API... would have been a "too good to be true" addition to the web :)

CodHeK commented 5 years ago

@marcoscaceres can we use this: https://www.npmjs.com/package/languagedetect ?

saschanaz commented 5 years ago

@CodHeK I don't think we want to add a 200+ KiB dependency. Condition: It should be small enough.

CodHeK commented 5 years ago

@saschanaz could you please help me find where the WebIDL attribute and operation identifiers are?

saschanaz commented 5 years ago

https://github.com/w3c/respec/blob/d451e5efb187ba3b62205998fa0e12606a5a21f1/src/core/webidl.js#L114-L130

@CodHeK Here, but what are you going to do?

CodHeK commented 5 years ago

@saschanaz I am just trying to understand what the issue is about and see if we could do anything ...

CodHeK commented 5 years ago

@saschanaz which attributes/identifiers here need to be checked for the language?

saschanaz commented 5 years ago

@CodHeK data.name should be checked there, as that represents IDL identifiers. I think data.name is enough.
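
As a rough illustration of collecting the names to check, the sketch below walks a parsed-IDL-like tree and gathers every `name` field. The node shape (`type`, `name`, `members`, `arguments`), the sample tree, and the helper `collectIdentifiers` are assumptions made for illustration; this is not ReSpec's actual code at the linked lines.

```javascript
// Hypothetical helper: gather identifier names from a parsed-IDL-like tree.
// The node shape used here is an assumption, not ReSpec's real data structure.
function collectIdentifiers(nodes, found = []) {
  for (const node of nodes) {
    if (typeof node.name === "string" && node.name !== "") {
      found.push(node.name);
    }
    // Interfaces nest members; operations nest arguments.
    for (const key of ["members", "arguments"]) {
      if (Array.isArray(node[key])) {
        collectIdentifiers(node[key], found);
      }
    }
  }
  return found;
}

// Hand-written sample tree, standing in for parser output.
const sampleTree = [
  {
    type: "interface",
    name: "Palette",
    members: [
      { type: "attribute", name: "colour" },
      {
        type: "operation",
        name: "normalise",
        arguments: [{ type: "argument", name: "value" }],
      },
    ],
  },
];

console.log(collectIdentifiers(sampleTree));
```

Each collected name would then be handed to whatever spelling check is chosen; the collection step itself is cheap and needs no dependency.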

CodHeK commented 5 years ago

You mean the variable name at L119?

marcoscaceres commented 5 years ago

I think this might be impossible to do reliably... @CodHeK, before looking at code, check if it's even possible at all without an actual real dictionary. I just don't think it is, because, for example:

attribute Laser laser;

So we would need some kind of dictionary lookup.... unless there is some actual Dictionary API on the web we could use, but that seems a bit extreme.
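
To make the dictionary-lookup idea concrete, here is a minimal sketch that splits a camelCase identifier into words and looks each one up in a tiny hand-picked map of non-US spellings. The map `NON_US_SPELLINGS` and the helpers `splitIdentifier` and `findNonUsWords` are hypothetical names invented for this example, and the four entries are only samples. A real check would need a far larger word list, which is where the size concern comes in.

```javascript
// Hypothetical, deliberately tiny map of non-US spellings to US spellings.
const NON_US_SPELLINGS = new Map([
  ["colour", "color"],
  ["normalise", "normalize"],
  ["centre", "center"],
  ["behaviour", "behavior"],
]);

// Split a camelCase or snake_case identifier into lowercase words.
function splitIdentifier(identifier) {
  return identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[\s_]+/)
    .filter(Boolean)
    .map(word => word.toLowerCase());
}

// Return every word of the identifier found in the map, with a suggestion.
function findNonUsWords(identifier) {
  return splitIdentifier(identifier)
    .filter(word => NON_US_SPELLINGS.has(word))
    .map(word => ({ word, suggestion: NON_US_SPELLINGS.get(word) }));
}

console.log(findNonUsWords("backgroundColour"));
console.log(findNonUsWords("laser"));
```

A word such as `laser` passes only because it is absent from the map, so the check can never be more complete than the list behind it.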

CodHeK commented 5 years ago

I found two packages, typo-js (83 KB) and check-word (3 KB). They both seem pretty lightweight, so we could use them, I guess.

https://github.com/cfinke/Typo.js

https://www.npmjs.com/package/check-word

marcoscaceres commented 5 years ago

ok, 83kb is way too massive... and looks like check-word doesn't give us the precision :(

https://github.com/S0c5/node-check-word/blob/master/index.js#L8

saschanaz commented 5 years ago

check-word is not 3 KiB as its dictionaries are already 37 MiB.

marcoscaceres commented 5 years ago

heh, yeah: https://raw.githubusercontent.com/S0c5/node-check-word/master/word-regexes/en-regex.js

marcoscaceres commented 5 years ago

it has the whole dictionary. That's awesome :) lol..

I think I'm going to close this.