fsfe / reuse-tool

reuse is a tool for compliance with the REUSE recommendations.
https://reuse.software
376 stars 144 forks source link

feat: lint output per line #956

Closed nicorikken closed 2 months ago

nicorikken commented 4 months ago

Adds an '-line' or '-l' option to the 'lint' command. Prints a line for each error, starting with the file to which the error belongs.

This output can be a starting point for some parsers, in particular ones that implement something similar to Vim errorformat parsing.

Related to https://github.com/fsfe/reuse-tool/issues/925, needed for #578

Additional work needed:

Signed-off-by: Nico Rikken nico@nicorikken.eu

nicorikken commented 4 months ago

Current version works based on ProjectReport to ensure also license errors are included. A structural improvement might be to create structural reports for license errors, even as FileReports. And for file reports ideally the line number is kept to attribute errors to lines where possible.

nicorikken commented 4 months ago

Two remaining suggestions for the future:

Current output:

/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: bad license: invalid
LICENSES/invalid-license-text: bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt: deprecated license
LICENSES/MIT: license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: missing license: invalid
LICENSES/MIT: unused license
LICENSES/Nokia-Qt-exception-1.1.txt: unused license
LICENSES/invalid-license-text: unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py: read error
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: without copyright

Note: I have now committed a change to remove colons from the error messages to aid in parsing.

nicorikken commented 4 months ago

I'm now trying to validate with the errorformat library: errorformat "%f: %m" It seems it cannot succesfully parse the output of files with spaces and colons:

/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| bad license invalid
LICENSES/invalid-license-text|| bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt|| deprecated license
LICENSES/MIT|| license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| missing license invalid
LICENSES/MIT|| unused license
LICENSES/Nokia-Qt-exception-1.1.txt|| unused license
LICENSES/invalid-license-text|| unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py|| read error
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py|| without license
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| without copyright

So probably filenames will have to be escaped.

I tried rendering output for SARIF, but it seems that it requires a built-in format and that all compatible built-in formats require line numbers and column numbers.

nicorikken commented 4 months ago

I originally had a test with a filename with colons, but this was causing errors on Windows. I removed it.

nicorikken commented 4 months ago

The issue of using files with spaces seems to be an issue with the errorformat library: https://github.com/reviewdog/errorformat/issues/97 I don't think reuse-tool needs to escape filename output as there is already a clear separator in this output. This leaves the edge-case of filenames with colons, but this is also rare in a development environment that will have to cater to Windows which doesn't allow colons in filenames.

carmenbianca commented 3 months ago

I also spent a while looking at escaping options. Thank you @nicorikken for your analysis. I think there are only two characters that really concern us here:

I gave it a spin with Pylint to see how other tools handle it. Here's the result:

$ pylint src/reuse
************* Module reuse.hello
world
src/reuse/hello
world.py:1:0: C0103: Module name "hello
world" doesn't conform to snake_case naming style (invalid-name)
************* Module reuse.hello:world
src/reuse/hello:world.py:1:0: C0103: Module name "hello:world" doesn't conform to snake_case naming style (invalid-name)

The answer is: don't.

I think we can be lazy about this and not bother.