Closed weihong1028 closed 7 years ago
I think the issue is here: https://github.com/mixu/markdown-stream-utils/blob/master/lib/annotate-md-headings.js
token.id = token.text.toLowerCase().replace(/[^\w]+/g, '-');
I'm guessing the Chinese title matches /[^\w]+/g
and hence becomes -. I think I'd need a piece of code that instead takes any non-word characters and converts them into base64 codes that correspond to the relevant (raw) characters since it looks like that's what the Chinese wikipedia does when it converts non-alphanumerics to a header id. Any chance you could write a PR to the repo I mentioned above with a fix (and ideally a test)?
edit: example from Wikipedia: https://zh.wikipedia.org/wiki/1949%E5%B9%B4%E5%AE%89%E5%B7%B4%E6%89%98%E5%9C%B0%E9%9C%87#.E5.9C.B0.E8.B4.A8
I think you are right.
The Chinese title will becomes -
when it matches /[^\w]+/g
.
And I just use encodeURIComponent()
to convert them.
token.id = encodeURIComponent(token.text.toLowerCase());
Now it works.
var id = this.options.headerPrefix + encodeURIComponent (raw.toLowerCase());
-{num}
Hello, I want to display the link raw like ../../#基本理解
,not like ../../#%20%JD%90
.
So I don't want to useencodeURIComponent
to convert, what is your opinion?
token.id = token.text.toLowerCase();
yeah I think token.id = token.text.toLowerCase().replace(/\s+/g, '-');
seems fine - according to http://stackoverflow.com/questions/70579/what-are-valid-values-for-the-id-attribute-in-html and https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/id there may be compatibility issues with non alphanumeric strings, but I think in practice that won't be an issue and the only "explicitly bad" string is whitespace
it seems slightly better than token.id = token.text.trim().toLowerCase();
because String.prototype.trim()
only removes spaces at the beginning or end of the string
thanks for looking into this! could you also add a test in https://github.com/mixu/markdown-styles/blob/master/test/integration.test.js
with chinese headings (preferably with the same heading used multiple times) so that if I end up making any additional changes at some later time, the test will catch any potential problems with Chinese characters?
Nice!token.id = token.text.toLowerCase().replace(/\s+/g, '-');
is better.
And I will try to add a test in integration.test.js
.
Hello, I think should use .trim()
removes spaces at the beginning or end of the string first, and then try .replace(/\s+/g, '-')
.
Is it right?
sgtm!
I try to add a test in https://github.com/mixu/markdown-styles/blob/master/test/integration.test.js
,and it can pass the test.
it('reads and renders metadata stored in a file with Chinese characters and the same heading used multiple times', function(done) {
var dir = fixture.dir({
'foo.md': [
'title: 你好 世界',
'author: Anonymous',
'----',
'# 世界因我而不同',
'## 世界因我而不同',
'### 世界因我而不同',
'# 一个世界,一个梦想',
'## 一个世界,一个梦想',
'### 世界因我而不同'
].join('\n')
});
var out = fixture.dirname();
mds.render({
input: dir,
output: out,
layout: layoutDir
}, function() {
assert.equal(fs.readFileSync(out + '/foo.html', 'utf8'), [
'"你好 世界" by Anonymous',
'<h1 id="世界因我而不同">世界因我而不同</h1>',
'<h2 id="世界因我而不同-1">世界因我而不同</h2>',
'<h3 id="世界因我而不同-2">世界因我而不同</h3>',
'<h1 id="一个世界,一个梦想">一个世界,一个梦想</h1>',
'<h2 id="一个世界,一个梦想-1">一个世界,一个梦想</h2>',
'<h3 id="世界因我而不同-3">世界因我而不同</h3>\n'
].join('\n'));
done();
});
});
But the generate link in nav-list
is wrong
great, could you update the PRs and I'll see if I can fix that last issue tomorrow?
Thank you, I have already update the PRs.
replace(/\s+/g, '-')
is wrong convert character .
"angular.directive".replace(/\s+/g, '-')
"angular.directive"
"angular directive".replace(/\s+/g, '-')
"angular-directive"
cool thanks! I did a bunch of cleanup, squashed a few commits and added a test for the TOC generation. It should all be fixed if you install v3.1.9
(e.g. npm install -g markdown-styles@latest
).
thanks again, let me know if you run into any further issues!
So great! It works! Thank you very much.
Chinese title will be resolved as "-"