jgm / pandoc

Universal markup converter
https://pandoc.org

Improve EditorConfig Settings #6520

Open tajmone opened 4 years ago

tajmone commented 4 years ago

The current .editorconfig settings enforce LF EOLs on all files. This conflicts with Git's defaults (core.eol = native), which normalize all text files to the native EOL; under Windows, that means all text files are checked out with CRLF. End users who have Git configured with core.autocrlf = true (recommended) won't be able to commit edited files, because EditorConfig enforces LF at save time. (see #6517)

I propose adjusting the .editorconfig to add rules for each file type instead of using the * pattern: use the native EOL for text files that allow it, and enforce LF or CRLF only on extensions that strictly need them (e.g. makefiles, batch scripts, etc.).
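
A minimal sketch of what such per-type rules could look like, using only standard EditorConfig properties. The globs and the choice of file types are illustrative assumptions, not the settings adopted in this repository. Omitting `end_of_line` from a section means EditorConfig imposes no EOL on those files, leaving the decision to Git's normalization:

```ini
# .editorconfig (illustrative sketch, not the repository's actual file)
root = true

# Text sources: no end_of_line here, so EditorConfig does not force
# an EOL and Git's native normalization stays in charge.
[*.{hs,lua,yaml}]
charset = utf-8
insert_final_newline = true
trim_trailing_whitespace = true

# Makefiles need LF line endings and hard tabs.
[Makefile]
end_of_line = lf
indent_style = tab

# Windows batch scripts need CRLF line endings.
[*.{bat,cmd}]
end_of_line = crlf
```

For the checkout side to agree, the matching `.gitattributes` entries would have to pin the same EOLs for the same patterns; that file is not sketched here.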

I use EditorConfig settings in most of my repositories to enforce code-consistency validation of every commit and PR (via Travis CI, using EClint), so if it's OK with everyone I'd like to add code-style validation to the repository. Here's an example of a repository using EditorConfig validation:

But the current .editorconfig/.gitattributes settings are far from ideal and need some extra attention if we want to also add code validation via Continuous Integration.

tarleb commented 4 years ago

This is nice, especially the automated tests. We might want to make a distinction between source code, documentation, and test files. We should be strict about whitespace in the latter, but can probably be more relaxed about the others.

tajmone commented 4 years ago

Sure, we can add different rules for certain folders.

Do the markdown sources (documentation) in this repo by any chance use trailing whitespace at line end for hard line breaks? If not, we could enforce trimming trailing spaces (unless some code blocks inside the docs have indent-only lines).

alerque commented 4 years ago

@tajmone Even if none of the Markdown currently uses the trailing space thing, I don't think any attempt should be made to strip something that could be syntactically meaningful to the language. A single trailing space could be removed, but two means something to markup and should never be removed.

Similarly, indentation shouldn't be messed with in Markdown (there are reasons to vary indentation, in nested lists for example, that don't play nicely with hard-and-fast coding rules).
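
As a hedged illustration of that position, a Markdown section could simply opt out of whitespace rewriting. The glob is an assumption about which extensions would be covered:

```ini
# Illustrative sketch: leave Markdown whitespace alone.
[*.{md,markdown}]
# Two trailing spaces mark a hard line break, so never trim.
trim_trailing_whitespace = false
# No indent_style or indent_size is set here, so nested-list
# indentation is not constrained by this section.
```

If a broader section (such as `[*]`) already set indentation properties, the EditorConfig specification's `unset` value could be used in this section to clear them.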

tajmone commented 4 years ago

OK, this afternoon I'll create a dedicated branch and start adding one file type at a time, so I can check the EClint validation report and fine-tune the settings. I'll use Travis CI for validation, so the jobs run in parallel with the current Circle CI jobs, which should be faster and also keep the two builds separate.

tajmone commented 4 years ago

I've started working on the new EditorConfig settings on a dev branch on my fork:

https://github.com/tajmone/pandoc/tree/EditorConfig

I've started by commenting out the original settings and then adding the settings for one file type at a time, keeping fixes to the sources (e.g. indentation adjustments) in separate commits.

I've also enabled Travis CI validation of all commits (via EClint, using a custom script validate.sh) so it's possible to see how it actually works:

https://travis-ci.com/github/tajmone/pandoc/builds/175268673

For now, I'm working on the file types whose settings are well established, postponing the more difficult file types, which might require some discussion.

Since the repository contains both GFM and pandoc markdown sources, I would suggest adopting the .md extension only for the former and .markdown only for the latter. Currently the .markdown extension seems to be used only for pandoc markdown sources, whereas the .md extension is used for both GFM and pandoc sources. A clear separation between the two would not only make it possible to target each format in the rule definitions, but would also be clearer for end users (IMO).