ekalinin / github-markdown-toc

Easy TOC creation for GitHub README.md
MIT License

TOC links do not jump to the heading when the title is Chinese #77

Closed yangchongduo closed 3 years ago

ekalinin commented 5 years ago

Could you provide an example?

counter2015 commented 5 years ago

@yangchongduo you can URL-encode the Chinese text and use the result as the TOC anchor. For example:

$  echo '标题' | tr -d '\n' | xxd -plain | sed 's/\(..\)/%\1/g'
%e6%a0%87%e9%a2%98

Then, in Markdown, the TOC entry would look like this:

* [标题](#%e6%a0%87%e9%a2%98)
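
The same encoding can be wrapped in a small function that does not need `xxd`. This is only a minimal sketch, assuming a POSIX `od` is available; `percent_encode_bytes` is a hypothetical name, and like the one-liner above it encodes every byte of its argument, including plain ASCII ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gh-md-toc): percent-encode every byte
# of the first argument, producing lowercase hex as in the example above.
percent_encode_bytes() {
  # od prints each byte as two hex digits; tr removes the spaces and
  # newlines between them; sed puts a '%' in front of every byte.
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | sed 's/\(..\)/%\1/g'
}
```

Calling `percent_encode_bytes '标题'` should give the same `%e6%a0%87%e9%a2%98` string as above.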
counter2015 commented 5 years ago

BTW, I just tested a Chinese title in my environment, and it works fine without URL encoding.

Maybe it's related to your environment?

My environment: Windows 10, Typora

alexmojaki commented 3 years ago

Example of a TOC I tried generating that didn't work: https://github.com/flasgger/flasgger/blob/ba6ca6233611378704d490b1fa6045d5e803b150/README.zh.md

The TOC has weird encoding like this:

* [高度参与的贡献者](#\xE9\xAB\x98\xE5\xBA\xA6\xE5\x8F\x82\xE4\xB8\x8E\xE7\x9A\x84\xE8\xB4\xA1\xE7\x8C\xAE\xE8\x80\x85)

Clicking on it does nothing.

If I click the link icon next to a header, the anchor just contains the Chinese characters directly, no encoding.

I'm on Chrome on Ubuntu.

duyanghao commented 3 years ago

@ekalinin same problem appears on my MacBook.

ekalinin commented 3 years ago

> @ekalinin same problem appears on my MacBook.

Hey @duyanghao! Thanks for the report.

Could you provide an example? (What is the source README? What TOC did you get? What TOC did you expect?)

duyanghao commented 3 years ago

> > @ekalinin same problem appears on my MacBook.
>
> Hey @duyanghao! Thanks for the report.
>
> Could you provide an example? (What is the source README? What TOC did you get? What TOC did you expect?)

Q1: What is the source README?

## 这是一个例子

这是段落内容

Q2: What TOC did you get?

Table of Contents
=================

      * [这是一个例子](#\xE8\xBF\x99\xE6\x98\xAF\xE4\xB8\x80\xE4\xB8\xAA\xE4\xBE\x8B\xE5\xAD\x90)

Created by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc)

Q3: What TOC did you expect?

Table of Contents
=================

      * [这是一个例子](#这是一个例子)

zeitounator commented 3 years ago

Note: if you feel like the following is not related, please just let me know and I will create a separate and complete bug report.


Unless I missed something, the problem is now exactly the same for any title containing special characters in any language (accents, cedillas, tildes, etc.). This used to work without problems in older versions, and I have no idea when the problem was introduced. I don't have time to look right now, but I will follow this ticket and may dig into it if possible/needed.

The only info I can give right now: I generated a TOC 3 months ago that was correct. In between I upgraded gh-md-toc at least twice, and generating a TOC for the same titles this morning gives a broken result.

Abridged example:

Titles in my file (this is in French, all my files are UTF-8):

# Aperçu
# Prérequis

Toc generated back then:

* [Aperçu](#aperçu)
* [Prérequis](#prérequis)

Toc generated today:

* [Aperçu](#aper\xC3\xA7u)
* [Prérequis](#pr\xC3\xA9requis)

Cheers.

zeitounator commented 3 years ago

I was able to confirm that we all suffer from the same problem: the UTF-8 encoded characters in the href returned by the GitHub Markdown API call are no longer decoded.

With some quick empirical tests, I noticed that version 0.6.0 worked correctly, and with git bisect I was able to trace the problem down to commit 495dfb3ebaf6db4c9c7fbea2e5266f43a785b40c, which is related to #104.

I'm currently looking at the part that was added to the awk script to see how to fix it. I'll push a PR if I find a solution.
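
As a possible stopgap until a fix lands, the literal `\xNN` escapes in the broken entries above can be turned back into raw UTF-8 bytes after the TOC is generated. This is a minimal sketch and not the fix for the awk script: it assumes bash (whose `printf '%b'` understands `\xHH`), and `decode_hex_escapes` is a hypothetical helper that is not part of gh-md-toc.

```shell
#!/usr/bin/env bash
# Hypothetical post-processing filter (not part of gh-md-toc):
# reads TOC lines on stdin and replaces literal \xNN escapes with the
# bytes they stand for, so "#aper\xC3\xA7u" becomes "#aperçu".
decode_hex_escapes() {
  local line
  # read -r keeps backslashes intact; the "|| [ -n ... ]" part also
  # handles a final line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # %b expands backslash escapes inside the argument. Caveat: it also
    # expands other escapes such as \n or \t if a line contains them.
    printf '%b\n' "$line"
  done
}
```

It is meant to be used as a filter, for example `gh-md-toc README.md | decode_hex_escapes`. Lines without backslashes pass through unchanged.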