markdown-it / linkify-it

Links recognition library with full unicode support
http://markdown-it.github.io/linkify-it/
MIT License

Schema keys are case-insensitive, but only in one direction? #102

Closed wolfgang42 closed 2 years ago

wolfgang42 commented 2 years ago

I'm trying to autogenerate links for ticket numbers which look like ABC-123. So I added a schema that I thought would match this:

markdown.linkify.add('ABC-', {
    validate: /[0-9]+\b/,
    normalize: match => match.url = `/bug/${match.url}`,
})

But this doesn't seem to work. After a lot of head-scratching, I discovered that it works if I make the key lowercase, like 'abc-', and this also matches the uppercase prefixes in my document. I don't really mind the case insensitivity, but it's very confusing that it only works one way, and having uppercase in keys just fails silently.
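One way to sidestep the silent failure, going only by the behaviour reported here (lowercase keys match, uppercase keys do not), is to lowercase the key before registering it. The sketch below is a minimal illustration: `addLowercased` is a hypothetical helper, not part of linkify-it, and it only assumes that the object passed in has the `.add(key, definition)` method used above.

```javascript
// Hypothetical helper, not part of linkify-it: registers a schema under the
// lowercased form of its key, so that a key written as 'ABC-' is stored as 'abc-'.
function addLowercased(linkify, key, definition) {
  // Pass through whatever .add() returns, so the caller can still chain calls.
  return linkify.add(key.toLowerCase(), definition);
}

// Intended usage with the schema from this issue (assumes `markdown` is a
// markdown-it instance with linkify enabled):
//
// addLowercased(markdown.linkify, 'ABC-', {
//     validate: /[0-9]+\b/,
//     normalize: match => match.url = `/bug/${match.url}`,
// })
```

This keeps the key readable at the call site while handing the library the lowercase form that was observed to work.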

(I was also very confused by the docs for key, which say "Only whitespaces and punctuation allowed." even though the example skype: has alphabetic characters. After staring for a while I think this sentence fragment is referring to the previous sentence, which should maybe be rephrased as “linkify-it makes sure that this prefix only matches if it is preceded by whitespace or punctuation, not with alphanumeric characters or symbols.” Though it's still not entirely clear to me what the difference between a "punctuation" and a "symbol" character might be.)

puzrin commented 2 years ago

See the Unicode docs for the definitions of the general "categories": http://www.unicode.org/reports/tr44/#General_Category_Values
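To make the punctuation/symbol distinction concrete, the sketch below classifies a few ASCII characters by Unicode general category using JavaScript's `\p{P}` (punctuation) and `\p{S}` (symbol) property escapes. The helper names `isPunctuation` and `isSymbol` are made up for illustration, and this is not a claim about which exact categories linkify-it checks before a schema prefix.

```javascript
// Illustrative helpers (names are hypothetical): test a single character
// against the Unicode general categories P (punctuation) and S (symbol).
const isPunctuation = ch => /^\p{P}$/u.test(ch);
const isSymbol = ch => /^\p{S}$/u.test(ch);

for (const ch of ['-', ':', '@', '+', '$', '^']) {
  const kind = isPunctuation(ch) ? 'punctuation'
             : isSymbol(ch) ? 'symbol'
             : 'other';
  console.log(JSON.stringify(ch), kind);
}
// '-', ':' and '@' are punctuation; '+', '$' and '^' are symbols.
```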

This package is designed to search for URI-like patterns, and may have specific behaviour. Your ticket pattern is not URI-like. It may work, but I cannot guarantee or predict anything.

In general, I'd suggest digging into /index.js if you have any questions about the logic (how schemas are compiled). It's quite simple (as long as you don't try to understand /lib/re.js).

wolfgang42 commented 2 years ago

> This package is designed to search for URI-like patterns, and may have specific behaviour.

Well, I was going off of the “Example 2” from the README, and the ticket IDs aren’t any less URI-like than a Twitter handle, really. Having had another poke at it, even for URI-like things (like the git: handler mentioned in the docs) there's still an undocumented requirement that the key is lowercase. I presume the intention is that the schema name as passed to .add() is already in canonical form, but that’s not really obvious as documented.
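The one-directional behaviour can be reproduced with a toy model. The sketch below is not linkify-it's source code; it is a self-contained illustration, under the assumption that prefixes are searched case-insensitively while the schema table is looked up by the lowercased prefix. `createToyLinkifier` and everything inside it are hypothetical names.

```javascript
// Toy model only, NOT linkify-it's implementation. It assumes:
//   1. keys are stored exactly as passed to add();
//   2. the text is scanned for prefixes case-insensitively;
//   3. the matched prefix is lowercased before the table lookup.
function createToyLinkifier() {
  const compiled = {}; // schema key -> definition

  return {
    add(key, definition) {
      compiled[key] = definition; // no normalization of the key
      return this;
    },

    match(text) {
      const keys = Object.keys(compiled);
      if (keys.length === 0) return [];

      // Escape regex metacharacters in keys, then build one search pattern.
      const escaped = keys.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
      const search = new RegExp(
        '(^|[^A-Za-z0-9])(' + escaped.join('|') + ')', 'gi');

      const results = [];
      let m;
      while ((m = search.exec(text)) !== null) {
        // Lookup by lowercased prefix: an uppercase key is never found here.
        const definition = compiled[m[2].toLowerCase()];
        if (!definition) continue;

        // Validate the text that follows the prefix, anchored at its start.
        const tail = text.slice(search.lastIndex);
        const anchored = new RegExp('^(?:' + definition.validate.source + ')');
        const found = anchored.exec(tail);
        if (found) results.push({ schema: m[2], text: m[2] + found[0] });
      }
      return results;
    },
  };
}

const upper = createToyLinkifier().add('ABC-', { validate: /[0-9]+\b/ });
const lower = createToyLinkifier().add('abc-', { validate: /[0-9]+\b/ });

console.log(upper.match('See ABC-123 now').length); // 0: fails silently
console.log(lower.match('See ABC-123 now')[0].text); // 'ABC-123'
```

In this model a lowercase key matches any casing in the text, while a key containing uppercase letters can never be looked up, which is the same asymmetry described above.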

puzrin commented 2 years ago

You can suggest a docs update via PR if you wish.