While these symbols are generally considered valid in URLs, they are a part of Markdown syntax. This can cause problems in some MD renderers, like in Discord embeds, where URLs containing these symbols sometimes break.
Current master
[TEST](https://lv.wikipedia.org/wiki/C%C4%93su_kauja_(1578))
_[Chandrayaan 1](https://lv.wikipedia.org/wiki/Chandrayaan1 "Chandrayaan 1")_
My fork
[TEST](https://lv.wikipedia.org/wiki/C%C4%93su%5Fkauja%5F%281578%29)
_[Chandrayaan 1](https://lv.wikipedia.org/wiki/Chandrayaan%5F1 "Chandrayaan 1")_
Currently the second example is just plain wrong - an underscore between Chandrayaan and 1 has gone missing and the URL is broken. The first example is less obvious, but the brackets at the end confuse Discord on Android, while the encoded version works just fine.
Since I was already making this change, I also chose to encode the * symbol, as it is also significant in Markdown and could cause problems.
Test results:
yarn run v1.22.10
$ jest
PASS test/default-tags-codeblock.test.ts
PASS test/options.test.ts
PASS test/default-tags.test.ts
PASS test/special-cases.test.ts
Test Suites: 4 passed, 4 total
Tests: 54 passed, 54 total
Snapshots: 0 total
Time: 2.456 s, estimated 3 s
Ran all test suites.
Done in 3.56s.
Benchmark:
yarn run v1.22.10
$ cd benchmark && yarn run benchmark quick
$ node execute.js quick
-----------------------------------------------------------------------------
node-html-makrdown (reused instance): 16.0300 ms/file ± 9.15999 (5.8 MB/s)
node-html-markdown : 16.3114 ms/file ± 9.81729 (5.7 MB/s)
turndown (reused instance) : 34.5944 ms/file ± 18.5378 (2.69 MB/s)
turndown : 34.9459 ms/file ± 18.7913 (2.66 MB/s)
-----------------------------------------------------------------------------
Total Files: 25
Avg. file size: 95.21 kB
-----------------------------------------------------------------------------
Estimated processing times (fastest to slowest):
[node-html-makrdown (reused instance)]
100 kB: 17ms
1 MB: 172ms
50 MB: 8.62sec
1 GB: 2min, 57sec
50 GB: 2hr, 27min, 7sec
[node-html-markdown]
100 kB: 17ms
1 MB: 175ms
50 MB: 8.77sec
1 GB: 2min, 60sec
50 GB: 2hr, 29min, 42sec
[turndown (reused instance)]
100 kB: 36ms
1 MB: 372ms
50 MB: 18.60sec
1 GB: 6min, 21sec
50 GB: 5hr, 17min, 30sec
[turndown]
100 kB: 37ms
1 MB: 376ms
50 MB: 18.79sec
1 GB: 6min, 25sec
50 GB: 5hr, 20min, 44sec
-----------------------------------------------------------------------------
Speed comparison - node-html-makrdown (reused instance) is:
1.02 times as fast as node-html-markdown
2.16 times as fast as turndown (reused instance)
2.18 times as fast as turndown
-----------------------------------------------------------------------------
Done in 3.80s.
While these symbols are generally considered valid in URLs, they are a part of Markdown syntax. This can cause problems in some MD renderers, like in Discord embeds, where URLs containing these symbols sometimes break.
Code example:
Output:
Currently the second example is just plain wrong - an underscore between
Chandrayaan
and1
has gone missing and the URL is broken. The first example is less obvious, but the brackets at the end confuse Discord on Android, while the encoded version works just fine.Since I was already making this change, I also chose to encode the
*
symbol, as it is also significant in Markdown and could cause problems.Test results:
Benchmark: