alan-if / alan-docs

Alan IF Documentation Project
https://git.io/alan-docs

EditorConfig: Known Issues and Workarounds #73

Open tajmone opened 4 years ago

tajmone commented 4 years ago

[WIP] — This Issue will be updated in due time.

This Issue is a memorandum about the limitations of EditorConfig and the known bugs of EClint. It presents tips and tricks for working around them, so that we can enforce code style consistency across editors and validate the code style of commits and pull requests.

The main post will be updated to provide information completeness, and comments will be used to discuss the topic, provide examples and ask questions.

Maybe, in the future, this could become a full-fledged article in the repository Wiki, as part of the developers' and contributors' documentation.

(To be continued...)


Enforcing Styles vs Validating Them

One of the biggest problems we have to tackle is reconciling the enforcement of code styles in editors/IDEs (which is the main purpose of EditorConfig) with the validation of code style consistency in commits and PRs. This dual task often requires compromises that are detrimental to the former.

EditorConfig doesn't provide separate settings for editor enforcement and for validation tools, which has some practical implications for repositories. The following considerations are far from a complete list of such problems; they cover those that most directly affect this repository and its file types.

Indentation Issues

Take indentation, for example. In an ALAN adventure source we'll usually want to set the default indentation to 2 spaces, but in some places we'll end up adding an extra space of indentation just to align some parts of the code more pleasantly, e.g.:

DESCRIPTION
  "It's a huge cave.
   Spider webs cover every wall."

In the above example the string spans multiple lines, and it's nicer to align the text on the second line with the text of the previous one, instead of aligning it with the quote character.

Also, ALAN authors tend to use a free-format approach to how long lines are handled, e.g.:

DESCRIPTION "It's a huge cave.
             Spider webs cover every wall."

In the above example, the second line has an odd alignment, which would cause an error report if the indentation were set to 2 spaces, since it's not a multiple of 2.

For the above reasons, it's better not to specify the indentation size, to allow any indentation level:

[*.{alan,i,a3sol}]
indent_style = space
indent_size = unset

Similarly, this affects syntaxes like Markdown, where the recommended indentation level should take into account the fact that Markdown documents often contain source code blocks, which have their own indentation according to the language of the block. So, while we'd like to provide end users with a consistent indentation style for Markdown elements (lists, etc.), we also need to prevent validation failures due to code blocks violating the .editorconfig settings.

Since there's currently no solution to this problem, all we can do is compromise, favoring passing EClint validation tests on Travis CI, at the cost of renouncing enforcement of consistent indentation styles in those file types where there can't be a strict usage of indentation rules.
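For Markdown, the compromise described above might look like the following sketch. The `*.md` glob and the chosen values are assumptions made for illustration; they are not taken from this repository's actual .editorconfig file.

```ini
# Hypothetical section for Markdown files (illustrative only):
[*.md]
indent_style = space
# Leave the size unset, so that embedded code blocks with their own
# indentation don't fail validation:
indent_size = unset
```

With these settings the validator only checks that indentation uses spaces, and the editor receives no hint about how many spaces to insert.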

The downside is that we'll be failing to provide a basic setting for indentation consistency across editors. The ideal solution would have been separate settings for the editor (two spaces, recommended) and for the validating tool (any number of spaces).

Unfortunately, the EditorConfig specification is quite rigid and doesn't take such everyday needs into account, favoring instead a "purist" approach.

The practical consequences vary from editor to editor: some editors allow manually overriding the .editorconfig settings, while others abide by them more strictly.

Lack of native EOL Support

Another example is the refusal by EditorConfig to accept native as a valid end_of_line value, which forces cross-platform projects that want to preserve native EOLs to unset the EOL configuration in .editorconfig, and rely on .gitattributes instead.

Although EditorConfig is widely used by Git users, this feature request has been turned down many times over, the main argument being that "native" is too vague a definition of an EOL type (although it seems to have been good enough for the Linux Kernel devs, who included it in Git).
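As a minimal sketch of this workaround, the .editorconfig side could simply clear the EOL property and leave line endings to Git. The `[*]` glob is an assumption, and so is the accompanying .gitattributes rule mentioned in the comments (`* text=auto`); neither is quoted from this repository.

```ini
# Hypothetical sketch: keep EOL handling out of EditorConfig.
# Native EOLs are then delegated to Git, e.g. via a .gitattributes
# rule such as "* text=auto" (assumed here, not shown in the source).
[*]
end_of_line = unset
```

Setting the property to `unset` matters mostly when a broader section or a parent .editorconfig file has already defined `end_of_line`; otherwise omitting the property has the same effect.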


About EClint

EClint is an EditorConfig validation tool that picks up the settings from the repository's .editorconfig file and checks that all the project files abide by the established code conventions.

EClint Bugs

EClint is known to contain many unresolved bugs, and the application hasn't been updated since 2018. Its maintainer even went as far as stating:

I have zero passion in this project, because I don't think I should have ever created it in the first place. Quite honestly, I don't think anyone should be using it […]

Although it's not the only EditorConfig validation tool available, the other ones that I've tried are even buggier. So, for now we'll have to use EClint, with some hacks and workarounds.

There are two types of EClint bugs that we have to deal with:

  1. Failure to detect code style violations.
  2. False positive reports.

The former is not a huge deal — we can assume that the contributed code is well formatted if the contributor is using an editor that supports EditorConfig, and at least it doesn't break the Travis CI build report.

The latter is more problematic, because it triggers a build failure on Travis CI. The current workaround for false positives is to disable the offending check on the affected files only, by unsetting the problematic rule(s) in .editorconfig for those specific files (see Broken Latin1 Validation below).

Failure to Validate Some File Types

One of the most notable bugs is that EClint often fails to validate certain file types, reporting that everything is fine while, in reality, these files violate the established conventions.

I've never quite understood why this happens, but it seems to be related to EClint failing to parse certain contents, which might contain character sequences that break the file validation (this often occurs with XML files, so my guess is that it might be due to poor string sanitization).

Broken Latin1 Validation

Alan sources need to be encoded in ISO-8859-1, but EClint's support for the charset = latin1 setting is poor: it reports false "character out of latin1 range" warnings for valid characters outside the ASCII range (e.g., accented letters, the © symbol, etc.).

When Alan sources contain special characters beyond the ASCII range, it's better to just override the encoding setting for those files (charset = unset) so that encoding validation will be skipped on them:

# Common settings for all Alan files:

[*.{alan,i,a3sol}]
indent_style = space
indent_size = unset
charset = latin1
trim_trailing_whitespace = true
insert_final_newline = true

; Skip encoding validation on ALAN sources containing non-ASCII chars:

[alanguide/alanguide-code/tvtime.alan]
charset = unset

[_dev/styles-tests/{demo,snippets}.alan]
charset = unset

Note that the remaining code style rules will still be checked on those files; only encoding validation is skipped.