markedjs / marked

A markdown parser and compiler. Built for speed.
https://marked.js.org

Marked Demo Format Conversion Issue #3525

Open tom-style opened 4 days ago

tom-style commented 4 days ago

Marked version: 14.1.3

Describe the bug

The format conversion in the Marked Demo (https://marked.js.org/demo) seems to be incorrect. Specifically, when converting a string that contains Chinese (full-width) punctuation, the resulting HTML is not as expected.

To Reproduce

Steps to reproduce the behavior:

  1. Go to the Marked Demo page.
  2. Enter the following string: Marked Demo(https://marked.js.org/demo),Marked Demo.
  3. Observe the converted HTML output.

Expected Output:

Marked Demo(<a href="https://marked.js.org/demo">https://marked.js.org/demo</a>),Marked Demo

Actual Output:

Marked Demo(<a href="https://marked.js.org/demo%EF%BC%89%EF%BC%8CMarked">https://marked.js.org/demo),Marked</a> Demo
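For anyone reading the actual output: the percent-encoded bytes at the end of the href are the UTF-8 encoding of the full-width closing parenthesis and full-width comma, which shows that both characters (plus the word after them) were taken as part of the URL. A small check, using only the standard `decodeURIComponent`:

```typescript
// Decode the percent-encoded tail that ended up inside the href.
const swallowed: string = decodeURIComponent("%EF%BC%89%EF%BC%8C");

// List the code points of the decoded characters as uppercase hex.
// Both characters are in the Basic Multilingual Plane, so charCodeAt is enough.
const codePoints: string[] = swallowed
  .split("")
  .map((ch) => "U+" + ch.charCodeAt(0).toString(16).toUpperCase());

// swallowed is "),": U+FF09 (full-width right parenthesis)
// followed by U+FF0C (full-width comma).
```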

UziTech commented 4 days ago

As you can see in this demo, marked does the same thing as GitHub.

(https://marked.js.org/demo),Marked Demo.

https://marked.js.org/demo),Marked Demo.

For autolinks, we try to follow what GitHub would do. In this case, it is working as expected.

If you would like marked to behave differently from GitHub, you can create an extension that changes the autolink tokenizer.

Or you can put angle brackets around the link to tell marked where the link should end:

(<https://marked.js.org/demo>),Marked Demo.

https://marked.js.org/demo),Marked Demo.
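When the input is not typed by hand, the angle brackets could in principle be added by a preprocessing step before the text reaches marked. The following is only a rough sketch of that idea: `wrapBareUrls` and the list of full-width punctuation in `BARE_URL` are hypothetical names and choices made for this example, not part of marked. It ignores code spans and code blocks, and it trims trailing ASCII punctuation more crudely than the GFM rules do.

```typescript
// Hypothetical pattern: a bare http(s) URL runs until whitespace, an angle
// bracket, or one of a few common full-width CJK punctuation marks.
const BARE_URL = /https?:\/\/[^\s<>()、。,:;!?「」『』【】]+/g;

// Wrap bare URLs in angle brackets so the Markdown parser knows where they end.
function wrapBareUrls(markdown: string): string {
  return markdown.replace(BARE_URL, (url: string, offset: number) => {
    const before = markdown.slice(Math.max(0, offset - 2), offset);
    // Leave URLs alone when they are already delimited: <url> or [text](url).
    if (before.slice(-1) === "<" || before === "](") return url;
    // Keep trailing ASCII punctuation outside the link (a simplification of
    // the GFM trailing-punctuation rules, with no parenthesis balancing).
    const trimmed = url.replace(/[.,;:!?)'"]+$/, "");
    return "<" + trimmed + ">" + url.slice(trimmed.length);
  });
}

const wrapped = wrapBareUrls("Marked Demo(https://marked.js.org/demo),Marked Demo.");
// wrapped is "Marked Demo(<https://marked.js.org/demo>),Marked Demo."
```

The result would then be passed to marked as usual. Whether the chosen stop characters are right depends entirely on the inputs you expect.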

tom-style commented 3 days ago

@UziTech Thank you for your explanation. I understand that Marked tries to mimic GitHub's behavior when handling autolinks. However, in my use case, the input strings can come from various sources, including user inputs and external APIs. This leads to unpredictable behavior, especially when dealing with links that contain commas or other special characters.

Is there a way to ensure that Marked works predictably in all scenarios, without variations due to different input sources? For example, can this be achieved through configuration options or extensions?

Additionally, regarding the method of wrapping links in angle brackets, while it can solve some issues, it is not always feasible to manually add angle brackets to user inputs. Are there any more general solutions available?

Thanks!

UziTech commented 3 days ago

The problem is that there is no "right" way to do it. Everyone's idea of what is "right" is different, so we need a specification that gives us the rules in order for everyone to agree on the right way. For example, which punctuation ends an autolink is specified in the GFM spec. You don't seem to agree with that spec, so you will have to create your own extension with the rules that you want.

I'll be honest: this will not be easy to get right, as there may be a lot of strange edge cases, but you can follow this documentation to learn how to create an extension.
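As a starting point for such an extension, here is a minimal sketch of a `url` tokenizer override that only steps in when a bare URL runs into full-width punctuation and otherwise returns `false` so the built-in tokenizer handles the input. The names `cjkAutolink`, `CJK_STOP`, and `LinkToken` are made up for this example, the token shape is modeled on the link tokens marked produces, and the stop list is an assumption. It has not been checked against the edge cases mentioned above, and depending on the marked version the link text may also need HTML-escaping here.

```typescript
// Local token shape so the sketch compiles without importing marked's types.
interface LinkToken {
  type: "link";
  raw: string;
  text: string;
  href: string;
  tokens: { type: "text"; raw: string; text: string }[];
}

// Hypothetical list of full-width punctuation that should end a bare URL.
const CJK_STOP = /[()、。,:;!?「」『』【】]/;

const cjkAutolink = {
  tokenizer: {
    url(src: string): LinkToken | false {
      // Candidate autolink: everything up to whitespace or "<".
      const candidate = /^https?:\/\/[^\s<]+/.exec(src);
      if (!candidate) return false; // not a bare URL: use the built-in tokenizer
      const stop = candidate[0].search(CJK_STOP);
      if (stop === -1) return false; // no full-width punctuation: keep default behavior
      const href = candidate[0].slice(0, stop);
      if (!/^https?:\/\/./.test(href)) return false; // nothing left after the scheme
      return {
        type: "link",
        raw: href, // consume only the URL itself from the source
        text: href,
        href,
        tokens: [{ type: "text", raw: href, text: href }],
      };
    },
  },
};

// Assumed usage, with marked imported separately: marked.use(cjkAutolink);
```

Returning `false` for everything else keeps the GitHub-compatible behavior for inputs that do not contain these characters, which limits how far the extension drifts from the spec.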