yuin / goldmark

:trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.
MIT License

Plural possessives (Typographer extension) #180

Closed brycewray closed 2 years ago

brycewray commented 3 years ago

This issue occurs with the Typographer extension in Hugo — including the most recent release, 0.80.0 — if goldmark is the selected parser. It doesn’t occur if Blackfriday is the selected parser and “smart” punctuation is activated. I have complied with the @yuin requirement for Hugo users to bring up such issues in the Hugo repo before doing so here (https://github.com/gohugoio/hugo/issues/8099).

Consider the following text in a Markdown file:

John's dog is named Sam. The Smiths' dog is named Rover.

Expected result: Each apostrophe should be a “smart”/curly apostrophe (’), given the default behavior of goldmark’s Typographer extension.

Actual result: Only the singular possessive (John's dog) has the “smart”/curly apostrophe, while the plural possessive (Smiths' dog) has the “dumb”/straight apostrophe (').
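To make the expectation concrete, here is a minimal Go sketch using only the standard library. The helper `smartenApostrophes` is hypothetical and is not goldmark's Typographer implementation; it assumes a deliberately naive rule (a straight apostrophe directly after a letter becomes U+2019) and ignores opening quotes and other cases a real typographer must handle.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// smartenApostrophes is a hypothetical helper for illustration only.
// It replaces a straight apostrophe with U+2019 (’) whenever the
// previous rune is a letter, so both "John's" and "Smiths'" qualify.
func smartenApostrophes(s string) string {
	var b strings.Builder
	var prev rune
	for _, r := range s {
		if r == '\'' && unicode.IsLetter(prev) {
			b.WriteRune('\u2019')
		} else {
			b.WriteRune(r)
		}
		prev = r
	}
	return b.String()
}

func main() {
	// Prints: John’s dog is named Sam. The Smiths’ dog is named Rover.
	fmt.Println(smartenApostrophes("John's dog is named Sam. The Smiths' dog is named Rover."))
}
```

Under this rule the plural possessive is treated the same as the singular one, which is the behavior the report expects from the Typographer extension.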

Thanks in advance for any help or consideration this issue may receive.

thegreatsunra commented 3 years ago

I can confirm that I'm seeing this issue when using plural possessives with the latest version of Hugo (currently 0.80.0).

In my testing, Hugo 0.74.3 (which uses Goldmark 1.1.31) "smartens" plural possessive primes as smart "curly" quotes as expected, but Hugo 0.75.0 (which adopted Goldmark 1.2.1) is where it stopped working.

Hugo 0.80.0 still uses Goldmark 1.2.1. Probably worth checking if this issue is resolved with Goldmark 1.3.1.

brycewray commented 3 years ago

> I can confirm that I'm seeing this issue when using plural possessives with the latest version of Hugo (currently 0.80.0).
>
> In my testing, Hugo 0.74.3 (which uses Goldmark 1.1.31) "smartens" plural possessive primes as smart "curly" quotes as expected, but Hugo 0.75.0 (which adopted Goldmark 1.2.1) is where it stopped working.
>
> Hugo 0.80.0 still uses Goldmark 1.2.1. Probably worth checking if this issue is resolved with Goldmark 1.3.1.

Interesting info. Thanks, @thegreatsunra. https://github.com/gohugoio/hugo/issues/8099

thegreatsunra commented 3 years ago

> Hugo 0.80.0 still uses Goldmark 1.2.1. Probably worth checking if this issue is resolved with Goldmark 1.3.1.

I cloned Hugo locally, updated ./go.mod to reference github.com/yuin/goldmark v1.3.1 and built the hugo executable from that source.

I created a test page in Markdown with the content John's dog is named Sam. The Smiths' dog is named Rover. and confirmed that, even with Goldmark 1.3.1, the singular possessive is "smartened" while the plural possessive is not.

moorereason commented 3 years ago

The typography extension hasn't changed since v1.1.33 (3c3d448), so Hugo's use of v1.2.1 is likely irrelevant.

thegreatsunra commented 3 years ago

> I cloned Hugo locally, updated ./go.mod to reference github.com/yuin/goldmark v1.3.1 and built the hugo executable from that source.

I've gone through each version of Goldmark this way, building and testing the hugo executable, and I'm pretty sure the plural possessive "no longer smartening" regression was introduced in Goldmark 1.1.32.

Goldmark 1.1.31 smartens plural possessives as expected. Goldmark 1.1.32 (and all subsequent versions) does not.

thegreatsunra commented 3 years ago

Ahh, yes: https://github.com/yuin/goldmark/compare/v1.1.31...v1.1.32

With Goldmark 1.1.32 the counter seems to throw things off:

# input
John's dog is named Sam. The Smiths' dog is named Rover.
# output (smartens John's)
John’s dog is named Sam. The Smiths' dog is named Rover.

# input
Johns dog is named Sam. The Smiths' dog is named Rover.
# output (smartens nothing)
Johns dog is named Sam. The Smiths' dog is named Rover.

# input
'Johns dog is named Sam. The Smiths' dog is named Rover.'
## output (smartens 'Johns and Smiths')
‘Johns dog is named Sam. The Smiths’ dog is named Rover.'

# input
'John's dog is named Sam. The Smiths' dog is named Rover.'
## output (smartens 'John's and Smiths')
‘John’s dog is named Sam. The Smiths’ dog is named Rover.'

yuin commented 3 years ago

In Japan, there aren't many opportunities to write English. Honestly, I'm not familiar with English. I welcome PRs from native English writers.

brycewray commented 3 years ago

> In Japan, there aren't many opportunities to write English. Honestly, I'm not familiar with English. I welcome PRs from native English writers.

@yuin, great to hear! I am not able to code for a PR but I will create a list of similar typographer issues so whoever can code for a PR can use it as a guideline. Maybe the same code can fix them all.

Edit: Decided the list would be better as an actual site showing the issues, so have put up https://gm-typographer.vercel.app/ as a minimal example.

brycewray commented 3 years ago

FWIW, Hugo 0.81.0 was just released, which updates the Goldmark version to 1.3.2; so I guess it'll be a while before this gets resolved.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

brycewray commented 3 years ago

To my knowledge, the problem remains.

brycewray commented 3 years ago

Hugo 0.82.0 is out. Since the release notes don't mention Goldmark at all, presumably it's still using 1.3.2. Just FYI.

brycewray commented 3 years ago

Hugo 0.83.1 is out with a bug fix for 0.83.0, which was released the day before. The release notes for 0.83.0 say it now uses Goldmark 1.3.5; this issue persists (see https://gm-typographer.vercel.app/).

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

skyfaller commented 3 years ago

Can we please get rid of the stale bot? I'm fine with commenting to prevent the issue from going stale, but it seems like unnecessary noise if there has been no change in the situation.

For the record, I am still experiencing this issue in Hugo v0.83.1. Here's my (possibly inferior) version of the test page: https://www.maximumethics.dev/blog/2021/03/smart-quotes/

brycewray commented 3 years ago

Issue is still present in Hugo 0.84.0, which uses Goldmark 1.3.8.

brycewray commented 3 years ago

Issue is still present in Hugo 0.85.0, which uses Goldmark 1.3.9.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

skyfaller commented 3 years ago

This issue is still present in Hugo 0.87.0, which uses Goldmark v1.4.0.

I still hate stale bot.

brycewray commented 3 years ago

@skyfaller at this point, I am convinced this is not going to be resolved. What complicates that is the fact that Hugo 0.87.0 deprecated Blackfriday so it's pretty much Goldmark or nothing going forward.

skyfaller commented 3 years ago

I posted a plea for help on the Hugo forum, maybe that will attract someone with the necessary coding chops: https://discourse.gohugo.io/t/smart-quotes-do-not-work-properly-in-hugo-with-goldmark/34266

natemoo-re commented 3 years ago

👋🏻 Hey folks! I'm maintaining a WASM wrapper of Goldmark for Deno (https://deno.land/x/goldmark). I got some feedback from @brycewray that this was a blocker, so I'm going to try to take a crack at a PR!

brycewray commented 3 years ago

Still present in Hugo 0.89.0, which uses Goldmark 1.4.2 (https://github.com/gohugoio/hugo/releases/tag/v0.89.0).

brycewray commented 2 years ago

Still present in Hugo 0.89.3, which uses Goldmark 1.4.3.

Updated demo.

brycewray commented 2 years ago

@yuin Thank you so much for pinning this, removing the "Stale" label, and setting a "Help wanted" label!

brycewray commented 2 years ago

Hugo 0.93.0 with goldmark 1.4.7 definitely fixed all issues I’d previously reported; see the newly updated version of my test site. Outstanding!

arif254 commented 2 years ago

I use Hugo and still get straight quotes. Segoe UI is to blame.

brycewray commented 2 years ago

> I use Hugo and still get straight quotes. Segoe UI is to blame.

That's odd. That font definitely has all the correct characters.