hexojs / hexo-renderer-markdown-it

Markdown-it is a Markdown parser, done right. A faster and CommonMark compliant alternative for Hexo.
MIT License

feat: add multi-languages support for anchor #94

Closed by snowyu 4 years ago

curbengh commented 4 years ago

Do non-ASCII characters not work in <h1 id="">?

snowyu commented 4 years ago

@curbengh What do you mean by non-ASCII? Can you give an example?

This PR uses limax instead of the very old sluggo library.

All options are passed through to the limax library, so you can customize the charmap used for transliteration. See the custom option of limax.
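
To make the charmap idea concrete, here is a minimal sketch of a slug function driven by a custom character map. It is not limax's implementation or API: `slugWithCharmap` and its `charmap` parameter are hypothetical names used only for illustration, and unlike slugize() this sketch lowercases its output.

```javascript
// Hypothetical helper for illustration only; NOT limax's real API.
// `charmap` maps a single character to the ASCII text that should replace it.
function slugWithCharmap(text, charmap = {}) {
  return Array.from(text)
    // Swap mapped characters for their replacement, padded so words stay separate.
    .map((ch) => (Object.prototype.hasOwnProperty.call(charmap, ch) ? ` ${charmap[ch]} ` : ch))
    .join('')
    .toLowerCase()
    // Collapse every run of non-alphanumeric characters into one hyphen.
    .replace(/[^a-z0-9]+/g, '-')
    // Trim leading and trailing hyphens.
    .replace(/^-+|-+$/g, '');
}

const slug = slugWithCharmap('I ♥ Hexo', { '♥': 'love' });
console.log(slug); // i-love-hexo
```

Characters that are neither ASCII alphanumerics nor present in the map are simply dropped here, which is why a real transliteration library ships large built-in maps.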

curbengh commented 4 years ago

ASCII is basically the English alphabet.

Non-English characters are usually represented in Unicode.

The example used in the unit test uses Unicode characters:

https://github.com/hexojs/hexo-renderer-markdown-it/blob/4c173f0967fd5fc661e4d0c57382766791743d1d/test/index.js#L227


Less than 3 months ago I tested using Unicode as an anchor in a browser, and it works fine if the characters are percent-encoded. I haven't tested Unicode anchors in this plugin yet. Does the anchor get percent-encoded automatically?

Edit: I can confirm the anchor doesn't get percent-encoded automatically, and the same is true of hexo-renderer-marked.


I'm all for a newer library. One advantage of romanizing the characters is that the result looks better than percent-encoding. But I do want to verify its performance first, so I'll run some benchmarks.
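
For reference, this is roughly what percent-encoding a Unicode anchor looks like, using only built-in JavaScript functions. The heading text is an arbitrary example chosen for this sketch, not one of the strings from the unit test.

```javascript
// Example heading text (an assumption for illustration, not taken from the test file).
const heading = '中文';

// Percent-encode the UTF-8 bytes of the id, as a browser would expect in a URL fragment.
const encoded = encodeURIComponent(heading);
console.log(encoded); // %E4%B8%AD%E6%96%87

// Decoding restores the original text, so the two forms refer to the same anchor.
const roundTrip = decodeURIComponent(encoded);
console.log(roundTrip === heading); // true
```

The encoded form is valid everywhere but unreadable, which is the trade-off against romanized slugs discussed in this thread.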

snowyu commented 4 years ago

Yes, limax produces meaningful ASCII instead of ugly percent-encoding, which humans cannot read.

curbengh commented 4 years ago

I tested the following characters:

https://github.com/hexojs/hexo-renderer-markdown-it/blob/4c173f0967fd5fc661e4d0c57382766791743d1d/test/index.js#L227

Except for the Japanese, they work fine as anchors (not percent-encoded).


For now, I would prefer to switch from sluggo to slugize() (https://github.com/hexojs/hexo-renderer-markdown-it/pull/95). It's used by hexo and hexo-renderer-marked, so it's more consistent. It also retains case by default (e.g. A is not transformed to a, unlike sluggo).

curbengh commented 4 years ago

```javascript
const Benchmark = require('benchmark');
const Suite = new Benchmark.Suite();
const { slugize } = require('hexo-util');
const sluggo = require('sluggo');
const limax = require('limax');

const test = 'Lorem ipsum dolor sit amet consectetur';

// Run each slug function against the same ASCII-only input.
Suite.add('slugize', () => {
  slugize(test);
}).add('sluggo', () => {
  sluggo(test);
}).add('limax', () => {
  limax(test);
}).on('cycle', function(event) {
  // Print the ops/sec summary for each finished benchmark.
  console.info(String(event.target));
}).run();
```

```
slugize x 371,612 ops/sec ±2.83% (86 runs sampled)
sluggo x 500,227 ops/sec ±2.91% (87 runs sampled)
limax x 31,977 ops/sec ±2.37% (89 runs sampled)

slugize x 380,602 ops/sec ±2.67% (87 runs sampled)
sluggo x 506,575 ops/sec ±2.80% (86 runs sampled)
limax x 32,333 ops/sec ±2.19% (94 runs sampled)

slugize x 355,863 ops/sec ±3.48% (82 runs sampled)
sluggo x 452,010 ops/sec ±4.40% (83 runs sampled)
limax x 28,249 ops/sec ±4.99% (78 runs sampled)
```

limax is more than 90% slower than both slugize and sluggo in these runs.
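
The slowdown figure can be checked from the reported numbers. The sketch below uses the ops/sec values from the first run; `percentSlower` is a helper name introduced here for illustration.

```javascript
// ops/sec figures copied from the first benchmark run above.
const opsPerSec = { slugize: 371612, sluggo: 500227, limax: 31977 };

// Percentage by which `candidate` throughput falls short of `baseline` throughput.
function percentSlower(candidate, baseline) {
  return (1 - candidate / baseline) * 100;
}

const vsSlugize = percentSlower(opsPerSec.limax, opsPerSec.slugize);
const vsSluggo = percentSlower(opsPerSec.limax, opsPerSec.sluggo);

console.log(vsSlugize.toFixed(1)); // 91.4
console.log(vsSluggo.toFixed(1)); // 93.6
```

Put another way, limax handles roughly one twelfth of slugize's throughput and one sixteenth of sluggo's on this ASCII-only input.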

curbengh commented 4 years ago

https://github.com/hexojs/hexo-renderer-markdown-it/pull/95