codespell-project / codespell

check code for common misspellings
GNU General Public License v2.0
1.84k stars 470 forks source link

Splitting multiple 'skip' paths onto indented config file lines behaves differently than when on a single line #3399

Closed oddhack closed 4 months ago

oddhack commented 4 months ago

The INI file format claims to support an option value that is split across multiple lines, if the lines after the first are indented. However, I see different behavior with the two settings

skip=*.adoc,./build_tests/*,
skip=*.adoc,
    ./build_tests/*,

The first works (skips a file with misspellings under build_tests/BAD), the second fails (reports the misspelling). Am I missing something about the INI file format that would allow this to work?

Attaching a zipfile with this minimal example.

codespell-config.zip

DimitriPapadopoulos commented 4 months ago

As far as I know, INI files lack a unified standard, and we rely on a Python library to read them. I'll have a look though, it might be just a matter of passing an option to that library.

Meanwhile, where exactly have you read the claim to support an option value that is split across multiple lines?

From the configparser documentation:

Values can also span multiple lines, as long as they are indented deeper than the first line of the value.

DimitriPapadopoulos commented 4 months ago

In the multi-line case, configparser.ConfigParser.read keeps the newline. After adding this debug line:

    print(config["codespell"]["skip"])

right after: https://github.com/codespell-project/codespell/blob/69c0f002e9e43ed92f31fa19e41bde5a96b065a2/codespell_lib/_codespell.py#L649 I see a comma-separated list of directories in the single-line case:

*.adoc,./build_tests/*,

while in the multi-line case, the newline is kept in addition to the commas:

*.adoc,
./build_tests

We need to discard the newline at some point.

oddhack commented 4 months ago

The documentation is a little vague on whitespace handling in INI files. This seems to be a case of the standard being defined by the implementation.

I wouldn't want to cause a bunch of work to deal with this, fairly fringe case (we have a lot of skipped files in our repo and it's helpful to group them on different lines for readability). So feel free to close this WONTFIX unless you want to pursue it - thanks!

Edit: I was actually hoping I'd misunderstood something about how the multiline values worked that I could fix on my end :-(

DimitriPapadopoulos commented 4 months ago

The fix is simple (#3400), and is very much related to the fix for #2727 (#2767).

I just need to find the time to write tests before #2767 and #3400 can be merged.