codeclimate-community / codeclimate-markdownlint


Bug: Atom users - trailing spaces #22

Open efueger opened 7 years ago

efueger commented 7 years ago

(not sure that action is needed, posting in case anyone else runs into this behavior)

A customer reported that the markdownlint engine reports trailing spaces if any of these are at the end of a line:

He said:

After a lot of trial and error, I have found that this issue has something to do with the Atom text editor. If I create a new markdown file using the SublimeText text editor, the markdownlint engine does not report these errors for that file. If I create the exact same file from scratch using the Atom text editor, the markdownlint engine does report them.


More details:

I recently started trying a new editor (Atom) and it turns out that the new editor automatically inserts CRLF (\r\n) line endings, instead of just LF (\n) as I was used to.

I looked into the markdownlint engine and found that for the "trailing white space" rule, this is how they're finding those trailing white spaces:

So, they're using a regular expression here, /\s$/. The \s is for "whitespace", and the $ is for "at the end of a line".

So I looked into how regex defines "whitespace", and it can depend on the "flavor" of regex (in this case, the flavor is Ruby).

They define a whitespace character like this:

  • /[ \t\r\n\f]/

So, they include both carriage returns (\r) and line feeds (\n).
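To make that concrete, here is a small Ruby sketch that checks each character from the class above against \s. The labels and variable names are mine, chosen for illustration, and it only exercises Ruby's own regex engine, not markdownlint.

```ruby
# Characters listed in the Ruby whitespace class quoted above,
# plus one ordinary letter as a control case.
candidates = {
  "space"           => " ",
  "tab"             => "\t",
  "carriage return" => "\r",
  "line feed"       => "\n",
  "form feed"       => "\f",
  "letter a"        => "a"
}

# Map each label to true/false depending on whether \s matches the character.
whitespace_matches = candidates.map { |label, char| [label, !(char =~ /\s/).nil?] }.to_h

whitespace_matches.each { |label, matched| puts "#{label}: #{matched}" }
```

Both the carriage return and the line feed should come back as matches, while the control letter should not.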

So then I wondered why I wasn't seeing the "Trailing spaces" errors for line feed characters on their own, but only when a carriage return and a line feed appear together.

I looked into the markdownlint engine again, and it looks like the code that represents a markdown document is here (https://github.com/markdownlint/markdownlint/blob/master/lib/mdl/doc.rb). In that code, they take the markdown source file and use the Ruby String.split (https://ruby-doc.org/core-2.2.0/String.html#method-i-split) method, like this:

When they split the markdown source file into lines on line feed characters, the line feeds themselves are removed and never analyzed; only the text between the line feeds (between the \n occurrences) is checked. With CRLF line endings, the carriage return is left behind at the end of each line. So this is why a line feed (\n) doesn't cause a "Trailing spaces" error, but a carriage return (\r) does.
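The behavior described above can be reproduced with a rough sketch. This is not markdownlint's actual code: trailing_whitespace_lines is a hypothetical helper, and it assumes the document is split on "\n" and each resulting line is tested against /\s$/, as the explanation above describes.

```ruby
# Hypothetical helper (not part of markdownlint): returns the 1-based numbers
# of lines that end in a whitespace character after splitting on line feeds.
def trailing_whitespace_lines(text)
  lines = text.split("\n")   # the "\n" separators are dropped here
  lines.each_index.select { |i| lines[i] =~ /\s$/ }.map { |i| i + 1 }
end

lf_text   = "# Title\nSome text\n"       # LF line endings
crlf_text = "# Title\r\nSome text\r\n"   # CRLF line endings

lf_result   = trailing_whitespace_lines(lf_text)
crlf_result = trailing_whitespace_lines(crlf_text)

puts "LF:   #{lf_result.inspect}"    # no lines flagged
puts "CRLF: #{crlf_result.inspect}"  # every line flagged, because "\r" remains
```

Under the same assumptions, splitting on /\r?\n/ instead of "\n" would also drop the carriage return, so the CRLF file would no longer be flagged.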