ardumont / markdown-toc

Generate a TOC in markdown file

`markdown-toc--to-link` needs to remove non-ASCII characters #58

Closed ccqpein closed 5 days ago

ccqpein commented 9 months ago

bug?

I think so.

M-x markdown-toc-bug-report

markdown-toc - Please:
- Describe your problem with clarity and conciceness (cf. https://www.gnu.org/software/emacs/manual/html_node/emacs/Understanding-Bug-Reporting.html)
- Explicit your installation choice (melpa, marmalade, el-get, tarball, git clone...).
- Report the following message trace inside your issue.

System information:
- system-type: darwin
- locale-coding-system: utf-8-unix
- emacs-version: GNU Emacs 30.0.50 (build 1, aarch64-apple-darwin23.3.0, NS appkit-2487.40 Version 14.3.1 (Build 23D60))
 of 2024-02-10
- markdown-mode path: /Users/user/.emacs.d/straight/build/markdown-mode/markdown-mode.el
- markdown-toc version: 0.1.5
- markdown-toc path: /Users/user/.emacs.d/straight/build/markdown-toc/markdown-toc.el

Expected behavior

When I run M-x markdown-toc-generate-toc, the TOC entry for one of my section titles should be:

- [hello · world](#hello--world)

Because · is not an ASCII character and is not supported by GitHub's README renderer, it should be replaced with an empty string in the link.

Actual behavior

- [hello · world](#hello-·-world)

The link contains ·, which shouldn't be there.

Steps to reproduce the behavior

  1. Create a section such as ### hello · world
  2. Run M-x markdown-toc-generate-toc or M-x markdown-toc-refresh-toc
  3. Check the generated TOC

Solution

I tried to figure out what was going on. When I evaluated (markdown-toc--to-link "hello · world") in the *scratch* buffer, it gave me the correct link.

But in my markdown file, the same call returns a link containing -·-.

After several tries, I found that the (replace-regexp-in-string "[[:punct:]]" "") call inside the markdown-toc--to-link function behaves differently in a markdown file than in my *scratch* buffer.

After some searching, I found that the Emacs documentation for [:punct:] says:

This matches any punctuation character. (At present, for multibyte characters, it matches anything that has non-word syntax, and thus its exact definition can vary from one major mode to another, since the syntax of a character depends on the major mode.)

So I guess that in markdown-mode, · does not count as [:punct:].
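One way to check this guess outside of any particular major mode is to match [[:punct:]] against the same character under two different syntax tables. This is only a sketch: the helper name my/punct-p-with-syntax is made up for illustration and is not part of markdown-toc, and the result may depend on the Emacs version.

```elisp
;; Hypothetical helper (not part of markdown-toc): match [[:punct:]]
;; against a one-character string under a chosen syntax class.
(defun my/punct-p-with-syntax (char-string syntax-descriptor)
  "Return non-nil if CHAR-STRING matches [[:punct:]].
The first character of CHAR-STRING is given SYNTAX-DESCRIPTOR
\(for example \"w\" for word or \".\" for punctuation) in a
temporary syntax table that is active during the match."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry (aref char-string 0) syntax-descriptor table)
    (with-syntax-table table
      (string-match-p "[[:punct:]]" char-string))))

;; Compare the two results for the middle dot:
(my/punct-p-with-syntax "·" "w")   ; · treated as a word constituent
(my/punct-p-with-syntax "·" ".")   ; · treated as punctuation
```

If the two calls return different values, the match really does depend on the syntax table that is current when the regex runs, which would explain the difference between the markdown buffer and *scratch*.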

I then changed the regex in markdown-toc--to-link from (replace-regexp-in-string "[[:punct:]]" "") to (replace-regexp-in-string "[[:punct:][:nonascii:]]" ""), and it worked.
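The effect of the changed regex can be tried in isolation. The function below is a minimal sketch, not the actual body of markdown-toc--to-link: the name my/title-to-anchor-sketch and the space-to-hyphen and downcase steps are assumptions made only so the example produces a complete anchor.

```elisp
;; Hypothetical stand-in for the link-building step; the real
;; markdown-toc--to-link may differ in its other transformations.
(defun my/title-to-anchor-sketch (title)
  "Return an anchor for TITLE without punctuation or non-ASCII characters."
  (let* (;; [:nonascii:] does not depend on the major mode's syntax table.
         (stripped (replace-regexp-in-string
                    "[[:punct:][:nonascii:]]" "" title))
         ;; Assumed step: each remaining space becomes a hyphen.
         (hyphenated (replace-regexp-in-string " " "-" stripped)))
    (concat "#" (downcase hyphenated))))

(my/title-to-anchor-sketch "hello · world") ; => "#hello--world"
```

One trade-off to keep in mind: [:nonascii:] removes every non-ASCII character, so accented letters and CJK text in a heading would be stripped from the link as well.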