decaporg / decap-cms

A Git-based CMS for Static Site Generators
https://decapcms.org
MIT License

Netlify CMS Open Authoring makes changes to the markdown and creates lots of diffs in a pull request #4537

Closed andreamussap closed 1 year ago

andreamussap commented 3 years ago

Describe the bug

Netlify CMS Open Authoring makes changes to the markdown and creates lots of diffs in a pull request, making it impossible to review.

To Reproduce

We've used Blackfriday markdown parser with Hugo on our project https://github.com/Axway/axway-open-docs , and we used to have issues with the diff of pull requests created via Netlify CMS: the contributor would change one line in the text, but the diff would show lots of changes.

After Hugo moved to Goldmark as its default markdown parser, we changed the parser from Blackfriday to Goldmark in our project as well (see PR#1047), both to be consistent with Hugo and to see whether this would fix the diffs that didn't come from actual edits to the page.

After that, the problem with the diffs improved a little in pages which had been changed via the CMS at some point. However, this isn't ideal. Ideally the CMS shouldn't change anything. For example, on https://github.com/Axway/axway-open-docs/pull/1357/files the contributor didn't change lines 349, 350, 375. I know that because I'm aware of this issue, but other reviewers are getting confused by these 'fake' diffs.

[Screenshot: CMS_changes]

As for new pages, or pages which were not edited via Netlify CMS yet, this is still a big issue when you edit them on the CMS for the first time. For example: https://github.com/Axway/axway-open-docs/pull/1394/files

It's hard to review a PR when you have a diff like that because you don't know what to look at. I created this PR myself, so I know that the only change is on line 133, to add a bold tag and split a word in two ("Specifies the maximum number of recent projects to list on the Recent Projects tab in Policy Studio."), but for a reviewer it is impossible to know what the contributor actually changed.

Expected behavior

Netlify CMS shouldn't change the markdown in the files.

Screenshots

N/A

Applicable Versions:

Netlify CMS version: 2.13.2
Git provider: GitHub
OS: Windows 10
Browser version: Chrome Version 86.0.4240.183 (Official Build) (64-bit)

CMS configuration

https://github.com/Axway/axway-open-docs/blob/master/static/admin/config.js

Additional context

N/A

andreamussap commented 3 years ago

@erezrokah Thank you for labeling this. Here is one more example of an unnecessary change made by the CMS: see PR#1393, line 18.

Short codes in Docsy are created using {{% alert %}}. CMS is changing this to {{< /alert >}} :(

[Screenshot: PR1393_cmsChangesDocsyCode]

andreamussap commented 3 years ago

Hi @erezrokah is there any plan to work on this issue?

I've just got a pull request on my project https://github.com/Axway/axway-open-docs/pull/1605, where everything looked fine on Netlify CMS preview (right-side preview of the CMS) while the user was updating the page.

However, after the PR was created, when I looked at the deploy preview, part of the content was totally unformatted: [Screenshot: DeployPreview_show_brokenparagraph]

This was caused by the CMS, because it automatically (!??) changed the formatting of the markdown text: [Screenshot: DeployPreview_cmsautochanges]

A PR that should have been quick and easy to review took much more time than necessary because I had to "fix" the formatting that the CMS had changed. Worse, this will very likely happen again the next time this page is updated: I'll have the same issue and will have to "fix" the formatting again, every time the page is updated.

This is just one more example that I collected today, but unfortunately this issue - CMS automatically changing formatting in the pages - is quite frequent.

Another problem this causes me is that sometimes I can't find what the user actually changed in the page because of the many formatting changes made by the CMS.

Hence, would it be possible for you to prioritize this fix? Thanks a mil. Andrea.

erquhart commented 3 years ago

Hi @andreamussap - a bit of explanation:

The changes are happening because Markdown parsers generally don't retain exactly how the Markdown input was written. They know, for example, that a section of content represents a list item, but they don't know whether the list item used a - or a *. The technical reason is that, until very recently, JavaScript Markdown parsers used abstract syntax trees rather than concrete syntax trees.
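To make that concrete, here is a minimal sketch in Python. It is not the CMS's actual parser; `ListItem`, `parse`, and `serialize` are hypothetical names, and the choice of `-` as the output bullet is an assumption for illustration. The point is that once the tree has no field for the original bullet character, the serializer has to pick one.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ListItem:
    # Hypothetical AST node: it stores the item text only.
    # There is deliberately no field for the original bullet character.
    text: str


def parse(markdown: str) -> List[ListItem]:
    """Toy parser: turn bullet lines into ListItem nodes, dropping the marker."""
    items = []
    for line in markdown.splitlines():
        match = re.match(r"^\s*[-*+]\s+(.*)$", line)
        if match:
            items.append(ListItem(text=match.group(1)))
    return items


def serialize(items: List[ListItem]) -> str:
    """Toy serializer: the marker is unknown, so it always writes '-' (an assumption)."""
    return "\n".join(f"- {item.text}" for item in items)


original = "* first\n* second"
round_tripped = serialize(parse(original))
print(round_tripped)
```

The text of each item survives the round trip, but the `*` markers do not, which is the kind of change that shows up as a diff on lines nobody edited. A concrete syntax tree would keep the marker on the node, so the serializer could write it back unchanged.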

The good news is, these mass changes only happen the first time a document that was not created by the CMS is parsed, as the CMS will output things the same way in subsequent edits.
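As a rough illustration of why the noise is limited to the first save, the sketch below uses a hypothetical `normalize` function as a stand-in for the CMS's parse-then-serialize step (it only rewrites bullet markers, which is a simplifying assumption) and counts changed lines with Python's `difflib`.

```python
import difflib
import re


def normalize(markdown: str) -> str:
    """Hypothetical stand-in for the CMS's parse-then-serialize step."""
    lines = []
    for line in markdown.splitlines():
        # Rewrite "*" or "+" bullets to "-", leaving the item text alone.
        lines.append(re.sub(r"^(\s*)[*+](\s+)", r"\1-\2", line))
    return "\n".join(lines)


def changed_lines(before: str, after: str) -> int:
    """Count added and removed lines in a unified diff of two documents."""
    diff = list(
        difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    )
    # The first two lines of a non-empty unified diff are the file headers.
    body = diff[2:]
    return sum(1 for line in body if line.startswith(("+", "-")))


hand_written = "* alpha\n* beta\n\nSome text."
first_save = normalize(hand_written)
second_save = normalize(first_save)

first_diff = changed_lines(hand_written, first_save)
second_diff = changed_lines(first_save, second_save)
print(first_diff, second_diff)
```

The first save touches every hand-written bullet line, while the second save of the already-normalized text produces an empty diff, because applying the same normalization twice changes nothing further.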

The level of overhaul required to avoid this behavior is non-trivial, and only recently made possible through new open source tech. I don't believe it's something we'll have capacity to prioritize in the near future. I do hope that the scope of this issue being limited to initial edits of manually created/edited docs eases the pain a bit. As more and more pages are edited, this behavior should eventually subside.

The specific issue you just shared around the broken code block formatting seems not to be a problem with the Markdown that is being output, but in how the static site generator is parsing and displaying it. If you paste the output into the CommonMark parser (CommonMark being the only actual spec for Markdown), the code blocks display correctly: https://spec.commonmark.org/dingus/

I hope this has been in some way helpful! If I can clarify anything else, please let me know.

martinjagodic commented 1 year ago

Closing as stale