JossWhittle / FlintPlusPlus

Flint++ is cross-platform, zero-dependency port of flint, a lint program for C++ developed and used at Facebook.
Boost Software License 1.0
266 stars 21 forks source link

Implement checkExceptionInheritance #38

Closed kanielc closed 10 years ago

kanielc commented 10 years ago

Directly navigated the tokens (didn't use the structures of classes/structs/unions, as I saw that after).

Seemed simple enough navigating directly I guess. Also updated the test file (now that we've sorted it.

JossWhittle commented 10 years ago

I've made a small modification as I merged so the check uses the pre identified structures list, see 5d57477db32f2671b1197de861c60479934cfcda.

The performance benefit is likely negligible because of cache coherency as we discussed before, but if we already have that information readily available there's no need to scan over the whole stream in search for them.

JossWhittle commented 10 years ago

I imagine there may be significant performance boosts for some of the deeper darker files within Boost where the source files are several megabytes in size and contain hundreds of thousands of tokens.

kanielc commented 10 years ago

Yeah, I realized it made sense when I noticed the versions taking the structure, but was too tired to keep going ;)

JossWhittle commented 10 years ago

That's fair enough it was a simple change of the outer loop given how you used the nice and clean std::find functions.

Just timed the linting Boost and got an average of 32.190 seconds using the structures lists vs. 35.147 without so there is a couple of seconds squeezed out of the swap.

I noticed as the linter was going two outputs to stderr, I was wondering if you could confirm that you get them too. I'm going to have a little hunt through boost now to see if it's obvious why the tokenizer fails for these two files.

Exception thrown during checks on .\boost\spirit\home\classic\utility\rule_parser.hpp.
Misplaced backslash in .\boost\spirit\home\classic\utility\rule_parser.hpp:422

Exception thrown during checks on .\boost\type_erasure\tuple.hpp.
Invalid character: ` in .\boost\type_erasure\tuple.hpp:571
kanielc commented 10 years ago

Yeah, I get those exceptions too.

JossWhittle commented 10 years ago

The error in rule_parser appears to be because of horrendous abuse of preprocessor directives... I can't imagine anyone else in the right minds would be trying to lint this style of code.

JossWhittle commented 10 years ago

Again the error in tuple.hpp seems to be related to preprocessor macros. This time where a macro parameter is a piece of syntax, a back-tick.

Perhaps these single types of single character token errors could be prevented if we introduced a new token type TK_UNEXPECTED which is placed into the stream, then allowing parsing to continue. We could still output to stderr, but as a heads up rather than a full blown exception.

kanielc commented 10 years ago

That could work, it may cause us to miss a few cases of sequences matching some valid pattern (if the UNEXPECTED was ignored), but seeing that we wouldn't have caught those anyway, it's a fair trade.

JossWhittle commented 10 years ago

Looking at the tokenizer I just realized in the case of a \ for a preprocessor directive it requires the backslash is immediately followed by a newline character. On line 422 of rule parser there are 3 space characters after the line which causes the error.

This just leaves the other error in tuple which seems to be an issue with arbitrary syntax being handled with macros. I'm not sure there is a lot that can be done with that, sadly.

JossWhittle commented 10 years ago

It appears the back-tick character isn't actually a valid char to have within the C++ standard. So this is actually an intentional error caused by the follow tokenizer case.

case '`':
    FBEXCEPTION("Invalid character: " + string(1, c) + " in " + string(file + ":" + to_string(line)));
    break;

I have a feeling we can deal with this more elegantly. Perhaps with we pass the errorFile object to the tokenizer and give it the ability to add an error to the output.

kanielc commented 10 years ago

It's a bit strange that the code compiles but is in gross violation of the standard. I agree though, it's better to note it in the errors that to just do a vague and file-fatal exception.

JossWhittle commented 10 years ago

https://github.com/L2Program/FlintPlusPlus/commit/17054531aa45860a0bc28ed07dd40113fd47bd06#diff-f4ae09eb4f1dde12a46b7d73178318b8L386

I've made the changes to the tokenizer, boost now lints entirely and we issue an Error in the output for the invalid character.