eslint-community / eslint-plugin-security

ESLint rules for Node Security
Apache License 2.0

New Rule: disallow unicode confusable identifiers #117

Open mhofman opened 1 year ago

mhofman commented 1 year ago

Rule details

Compute the Unicode skeleton of each declared identifier and report the declaration if its skeleton matches that of a different identifier already in scope.

Related CVE

CVE-2021-42694

Example code

const loremIpsum = "latin only";
const lоrеmIрsum = "with Cyrillic ";
const lorem‍Ipsum = "with ZWJ";

Participation

Additional comments

The Zero-Width Joiner (\u200d) is a valid identifier character, even though some parsers, such as the ones used by TypeScript or Webpack, fail to parse it correctly.
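As a rough illustration of why the ZWJ is accepted, here is a minimal sketch of the ECMAScript identifier shape written as a regular expression. The name `IDENTIFIER_NAME` is hypothetical, and the pattern deliberately ignores Unicode escape sequences and reserved words, so it is an approximation and not a parser.

```javascript
// Approximation of the ECMAScript IdentifierName grammar:
// first character: ID_Start, "$" or "_"
// later characters: ID_Continue, "$", ZWNJ (U+200C) or ZWJ (U+200D)
// Assumption: escape sequences such as \u0061 and reserved words are not handled.
const IDENTIFIER_NAME = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u;

// ZWJ in the middle of a name matches the identifier shape...
const zwjInside = IDENTIFIER_NAME.test("lorem\u200DIpsum");

// ...but it cannot be the first character of a name.
const zwjFirst = IDENTIFIER_NAME.test("\u200DloremIpsum");

console.log({ zwjInside, zwjFirst });
```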

The Cyrillic characters in the example code are one case of Unicode characters that are confusable with Latin characters, but there are many other possibilities, including confusion between non-Latin characters. Unicode defines an algorithm to compute the skeleton of a text, which we could apply to identifiers, basing the comparison on the skeleton instead of the identifier string.
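To make the idea concrete, here is a minimal sketch of a skeleton-based comparison. It is not a full implementation of the Unicode algorithm: `PROTOTYPES` is a tiny hand-written subset standing in for the real confusables data, `IGNORABLE` only covers ZWNJ and ZWJ, and the helper names `skeleton` and `findConfusables` are hypothetical. A real rule would also need to walk ESLint scopes instead of a flat list of names.

```javascript
// Hand-picked subset of confusable mappings (assumption: a real rule would
// load the complete confusables data published by Unicode).
const PROTOTYPES = new Map([
  ["\u043E", "o"], // Cyrillic small letter o
  ["\u0435", "e"], // Cyrillic small letter ie
  ["\u0440", "p"], // Cyrillic small letter er
  ["\u0430", "a"], // Cyrillic small letter a
  ["\u0441", "c"], // Cyrillic small letter es
]);

// Invisible characters dropped before comparison (assumption: only ZWNJ and
// ZWJ are handled here, not every default-ignorable code point).
const IGNORABLE = new Set(["\u200C", "\u200D"]);

// Simplified skeleton: NFD, drop ignorables, map to prototypes, NFD again.
function skeleton(name) {
  let out = "";
  for (const ch of name.normalize("NFD")) {
    if (IGNORABLE.has(ch)) continue;
    out += PROTOTYPES.get(ch) ?? ch;
  }
  return out.normalize("NFD");
}

// Report every name whose skeleton collides with an earlier, different name.
function findConfusables(names) {
  const seen = new Map(); // skeleton -> first identifier with that skeleton
  const reports = [];
  for (const name of names) {
    const key = skeleton(name);
    const prior = seen.get(key);
    if (prior === undefined) {
      seen.set(key, name);
    } else if (prior !== name) {
      reports.push({ name, confusableWith: prior });
    }
  }
  return reports;
}

// The three identifiers from the example code, written with escapes so the
// confusable characters are visible.
const reports = findConfusables([
  "loremIpsum",
  "l\u043Er\u0435mI\u0440sum", // Cyrillic o, ie, er
  "lorem\u200DIpsum", // ZWJ between "lorem" and "Ipsum"
]);
console.log(reports.length);
```

Keying the lookup on the skeleton means each new declaration is checked with one map lookup, and the first identifier seen is treated as the one later declarations are confusable with.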

First reported in https://github.com/eslint/eslint/issues/15240#issuecomment-961535750

mhofman commented 1 year ago

Related to #116

nzakas commented 1 year ago

When you say the zero-width joiner is causing a parsing error, where do you see that?

mhofman commented 1 year ago

Oh my bad, it's because I'm using typescript-eslint, and tsc is choking on ZWJ!

nzakas commented 1 year ago

Ah okay, good to know! I was confused because the default parser was working okay. 👍