tomasbjerre / violations-lib

Java library for parsing report files from static code analysis.
Apache License 2.0
148 stars 39 forks source link

Flake8 parser regex too unspecific #184

Closed henglert closed 7 months ago

henglert commented 8 months ago

The flake8 parser uses following regex to scan the output for flake8 issues: "([^:]*):(\\d+)?:?(\\d+)?:? \\[?(\\D+)(\\d*)\\]? (.*)" If there is a match, the row (group 2) is checked for valid int, Code (group 4) and Message (group 5) are not checked to be not empty, but a null message throws an java.lang.NullPointerException: message in Jenkins. Since this regex is very unspecific, I have false positives in my build environment, which break the builds, e.g. even a timestamp is matched with empty message: 2023-11-09 11:52:49 Starting Build 3

Flake 8 build-in formatters use %(path)s:%(row)d:%(col)d: %(code)s %(text)s (Default formatter) and %(path)s:%(row)d: [%(code)s] %(text)s(Pylint formatter)

The "code" will match regex "^[A-Z]{1,3}[0-9]{0,3}$".

To support the 2 built-in formatters I suggest using for example following regex:

"([^:]+):(\\d+):( \\[([A-Z]{1,3})?([0-9]{0,3})\\] (.+)|(\\d+): ([A-Z]{1,3})([0-9]{0,3}) (.+))

with checks of Group 2 as Integer and Group 6 / Group 10 not empty.

tomasbjerre commented 8 months ago

Can you provide an example report?

henglert commented 8 months ago

Here's an obfuscated but complete build report. failed_report_flake8.txt It looks that e.g. line 610 is matched as false positive and throws the NullPointerException, because Message is Null. This output is produced by some MS-build tools, but in other projects I had false positives too. In my opinion the match of "(somestring):(number): (string) (string)" is just too general and will often produce false positives.

henglert commented 8 months ago

At least the check for "Message" not null should be included.

tomasbjerre commented 8 months ago

I tried the report in a test case here: https://github.com/tomasbjerre/violations-lib/tree/feature/issue-184-flake8-regex-too-unspecific

Perhaps you can build on that and supply a PR.

tomasbjerre commented 7 months ago

Released now. Open issue again if not working.