DavidAnson / markdownlint

A Node.js style checker and lint tool for Markdown/CommonMark files.
MIT License
4.81k stars 733 forks source link

MD038 triggered when using | in code block inside a table #798

Closed pascalberger closed 1 year ago

pascalberger commented 1 year ago

Since version 0.28.0 (markdownlint-cli 0.34.0) the following markup throws an MD038 warning:

| |
| - |
|`a |` `foo`|

If the | in the first code block is removed the warning is no longer reported.

DavidAnson commented 1 year ago

Here’s that exact content as rendered by GitHub:

a | foo

Yours is a fairly extreme/invalid scenario, but you can see above that the cell with the “a“ ends at the pipe character so the following two backtactics create a code span with a space in it and the lint warning is correct. (It seems the extra table cell is not rendered, but that’s another story.)

Is this a simplification of a real scenario, or is this more of a curiosity?

pascalberger commented 1 year ago

Is this a simplification of a real scenario, or is this more of a curiosity?

I got this warning after updating to markdownlint-cli 0.34.0 on a real docs site. The actual content of the code spans are regular expressions, with the first containing a pipe character.

DavidAnson commented 1 year ago

This rule is using is a different parser, micromark, but I think that is probably consistent about how it is parsing the table. If you want to point me at the original Markdown, I can have a look. However, my guess is that table is not rendering as intended on GitHub.

DavidAnson commented 1 year ago

FYI: https://github.com/micromark/micromark-extension-gfm-table#escaped-pipes-in-code

pascalberger commented 1 year ago

This rule is using is a different parser, micromark, but I think that is probably consistent about how it is parsing the table. If you want to point me at the original Markdown, I can have a look. However, my guess is that table is not rendering as intended on GitHub.

Original Markdown unfortunately is in a private repo. Also not rendered on GitHub, but by Statiq which uses Markdig under the hood, and rendering works there.

DavidAnson commented 1 year ago

Okay. My policy is CommonMark specification for core functionality and GitHub Flavored Markdown specification for tables and such. If one of the parsers has a bug, I can report it to them. If my code has a bug, I can try to fix it. But if there's a specification dispute, I probably can't help. In this case, the parsing of the table seems consistent with GitHub, so I think the warning is legitimate. If you can give a more detailed example, maybe we can find a way to make it work?

Ravlen commented 1 year ago

@DavidAnson We ran into the same problem when trying to update recently, see this page: https://docs.gitlab.com/16.0/runner/configuration/advanced-configuration.html#the-runnersdocker-section

The volumes_from row in that table has a | in backticks (<container name>[:<ro|rw>]), which renders fine with our renderer (Nanoc), but like above got flagged by the latest version of markdownlint. We used double backticks to work around it, which is fine, but it does feel like a bug if it's broken with single backticks, but fine with double backticks.

DavidAnson commented 1 year ago

@Ravlen, can you please show an example to reproduce this? I can't find the Markdown for that page in the repo linked at the bottom of it. Per the parser docs, unescaped pipes are not allowed in a table cell, so the GFM rendering of your original content is probably not what you wanted. https://github.com/micromark/micromark-extension-gfm-table#syntax

Ravlen commented 1 year ago

@DavidAnson Yeah, it's not ideal, but I felt it was worth pointing out if something could be done about it. Like GitHub, GitLab struggles with it too, though we don't worry about that too much because our renderer for docs.gitlab.com (Nanoc) is happy with it.

DavidAnson commented 1 year ago

The reason double backticks didn't warn is probably because that approach doesn't create an unmatched backtick in the right-most cell after the embedded pipe character splits the cell. The double backticks match with each other instead of a single backtick trying to match with the first backtick of the next code span ("ro").

I don't think there's anything for me to do here as we're still discussing an improperly split table and escaping the pipe character gives the desired behavior and no linting warnings.

For my own benefit, here's a repro with just those two lines: https://dlaa.me/markdownlint/#%25m%23%20Issue%20798%0A%0A%7C%20Parameter%20%7C%20Description%20%7C%0A%7C%20---------%20%7C%20-----------%20%7C%0A%7C%20%60volumes_from%60%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7C%20A%20list%20of%20volumes%20to%20inherit%20from%20another%20container%20in%20the%20form%20%60%3Ccontainer%20name%3E%5B%3A%3Cro%7Crw%3E%5D%60.%20Access%20level%20defaults%20to%20read-write%2C%20but%20can%20be%20manually%20set%20to%20%60ro%60%20(read-only)%20or%20%60rw%60%20(read-write).%20%7C%0A%7C%20%60volumes_from%60%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7C%20A%20list%20of%20volumes%20to%20inherit%20from%20another%20container%20in%20the%20form%20%60%60%3Ccontainer%20name%3E%5B%3A%3Cro%7Crw%3E%5D%60%60.%20Access%20level%20defaults%20to%20read-write%2C%20but%20can%20be%20manually%20set%20to%20%60ro%60%20(read-only)%20or%20%60rw%60%20(read-write).%20%7C%0A