abduld / include-what-you-use

Automatically exported from code.google.com/p/include-what-you-use
Other
0 stars 0 forks source link

IWYU suggest too much include: confusion with folders #144

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

Hi!

I am having an issue where IWYU suggest to include a header that is not 
necessary. The case is:

class A -> parent class

class B -> inherits from class A

in class B cpp, IWYU suggest to include classA.h, but that's already included 
in classB.h, since it's necessary to declare the inheritance.

Digging a bit in the issue I found that this problem appears under certain 
combination of additional include folders. In our codebase we have a folder 
structure where we have a solution in the root folder, and all the libs are one 
level below this root.

root
  |___ libA
  |___ libB
  |___ libC

Our libs set as additional include folder its parent, and all includes are in 
the form of:

#include "libB/path/to/headerA.h"
#include "libC/path/to/headerB.h"

However, for IWYU to work, I must add as an additional include folder the path 
to the lib itself, otherwise it won't find the precompiled header.

So the command line becomes something like this:

include-what-you-use.exe F:\Test\libA\ClassB.cpp -IF:/Test/ -IF:/Test/libA/ 
-Xiwyu --pch_in_code -Xiwyu --prefix_header_includes=keep

With the attached files, IWYU will report:

Source file F:/Test/libA/ClassB.cpp should add:
#include "libA/classA.h"             // for classA

which is not correct.

However, if I change classB header include from this:

#include "libA/classA.h"

to this:

#include "classA.h"

it will report that everything is ok, but in practice nothing changed, the 
files are the same, and we want to follow that path convention.

Do you have any idea about could be wrong here?

Thank you!

Original issue reported on code.google.com by dpun...@gmail.com on 4 Jun 2014 at 9:50

Attachments:

GoogleCodeExporter commented 9 years ago

If it's any help, for the case where IWYU fails I get this:

--- Calculating IWYU violations for F:/tests/TestProj/TestLib/ClassB.cpp ---
Ignoring use of classA::classA (F:/tests/TestProj/TestLib/ClassB.cpp:7:9): 
member of class
Ignoring use of classA::Reset (F:/tests/TestProj/TestLib/ClassB.cpp:13:2): 
member of class
Marked use of include-file  (from "TestLib/classB.h") at Invalid location
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classA.h to "classA.h" for classA (only 
candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Ignoring full use of  (Invalid location): #including dfn from "stdafx.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:7:1): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:11:6): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"
Ignoring full use of  (Invalid location): #including dfn from "TestLib/classB.h"
F:/tests/TestProj/TestLib/ClassB.cpp:13:2: warning: classA is defined in 
"classA.h", which isn't directly #included.

When it works I get this:

--- Calculating IWYU violations for F:/tests/TestProj/TestLib/ClassB.cpp ---
Ignoring use of classA::classA (F:/tests/TestProj/TestLib/ClassB.cpp:7:9): 
member of class
Ignoring use of classA::Reset (F:/tests/TestProj/TestLib/ClassB.cpp:13:2): 
member of class
Marked use of include-file  (from "TestLib/classB.h") at Invalid location
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Mapped F:/tests/TestProj/TestLib/classA.h to "classA.h" for classA (only 
candidate)
Mapped F:/tests/TestProj/TestLib/classB.h to "TestLib/classB.h" for classB 
(only candidate)
Ignoring full use of  (Invalid location): #including dfn from "stdafx.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:7:1): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:11:6): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classA (F:/tests/TestProj/TestLib/ClassB.cpp:13:2): 
#including dfn from "classA.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"
Ignoring full use of  (Invalid location): #including dfn from "TestLib/classB.h"

Original comment by dpun...@gmail.com on 4 Jun 2014 at 9:57

GoogleCodeExporter commented 9 years ago

Hm, this issue seems to appear in many different places and apparently at 
random, It's making me second-guess all the suggestions now. I would really 
like to find a solution so I'm going to try to fix it myself and if I find a 
way propose it to you.

If you could give me a hint about where to start looking it would help a lot. 
Thank you.

Original comment by dpun...@gmail.com on 6 Jun 2014 at 2:16

GoogleCodeExporter commented 9 years ago
I can't say for sure what's going on, but I seem to recall there's a little 
blurb in iwyu_path_util.cc/ConvertToQuotedInclude:

  // HeaderSearchPaths is sorted to be longest-first, so this
  // loop will prefer the longest prefix: /usr/include/c++/4.4/foo
  // will be mapped to <foo>, not <c++/4.4/foo>.
  for (Each<HeaderSearchPath> it(&search_paths); !it.AtEnd(); ++it) {
    if (StripLeft(&path, it->path)) {
      StripLeft(&path, "/");
      if (it->path_type == HeaderSearchPath::kSystemPath)
        return "<" + path + ">";
      else
        return "\"" + path + "\"";
    }
  }

This is probably why it prefers the shortest include name when there are 
multiple options.

I don't have time to look at this at the moment, but I find it confusing that 
stdafx.h puts special demands on the header search path? Where is it located? 
The example in your attached rar is just a flat structure, so that can't 
provoke the problem, right?

Original comment by kim.gras...@gmail.com on 6 Jun 2014 at 2:25

GoogleCodeExporter commented 9 years ago

The thing is on MSVC the precompiled header MUST be just the name of the file 
(stdafx.h) with no path etc.

In the settings, since we want all paths to be relative from the root of the 
solution we only add as additional include folder the root of the solution, and 
not the root of the project. If you do the same with IWYU it won't work, 
because if you have a file in a subfolder inside the project it won't be able 
to compile it properly since it won't find stdafx.h. So I have to give him the 
root of the solution AND the root of the project.

Kim I was trying to understand what's going on with this, isn't it strange that 
IWYU is reporting these logs when it fails?

Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:7:1): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:11:6): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"
Ignoring full use of classB (F:/tests/TestProj/TestLib/ClassB.cpp:13:10): 
#including dfn from "TestLib/classB.h"

It doesn't seem to find any reference to classA.h

Original comment by dpun...@gmail.com on 6 Jun 2014 at 2:31

GoogleCodeExporter commented 9 years ago

Also I attach here the full solution to understand better.

Original comment by dpun...@gmail.com on 6 Jun 2014 at 2:34

Attachments:

GoogleCodeExporter commented 9 years ago

I see now that in the wrong case classA.h is missing from the 
effective_direct_includes list, while when it works it is there. I will keep 
digging.

Original comment by dpun...@gmail.com on 6 Jun 2014 at 2:47

GoogleCodeExporter commented 9 years ago
> The thing is on MSVC the precompiled header MUST be just the name of
> the file (stdafx.h) with no path etc.

For reference, this is not the case. We use #include "prefix/stdafx.h" in our 
project at work. As long as you adjust the "Precompiled Header File" setting to 
spell the same name as the one you include, it works fine.

> In the settings, since we want all paths to be relative from the root of the
> solution we only add as additional include folder the root of the solution,
> and not the root of the project. If you do the same with IWYU it won't work,
> because if you have a file in a subfolder inside the project it won't be able 
to 
> compile it properly since it won't find stdafx.h.

I wonder if this is a compatiblity problem between Clang and MSVC. I know there 
have been specific commits to Clang to make header search MSVC-compatible 
before, but maybe there's a case they've missed? I don't think IWYU does 
anything specific here, but I'm not 100% sure.

> Mapped F:/tests/TestProj/TestLib/classA.h to "classA.h" for classA (only 
candidate)

This looks suspicious. See if you can find where that message is logged and 
then dig backwards from there.

Original comment by kim.gras...@gmail.com on 6 Jun 2014 at 3:03

GoogleCodeExporter commented 9 years ago
With your full project this seems to work without any stdafx.h confusion 
(though it does reproduce the bug):

D:\temp\dpunset\TestProj$ include-what-you-use.exe TestLib\ClassB.cpp -I. 
-Xiwyu --pch_in_code

So you shouldn't need to spell more than one include path in any case.

Aah... This is the same issue we've seen before. Just because classB derives 
from classA it doesn't mean it re-exports classA.

I'm still not sure if this is a conscious design decision/implemented behavior, 
but here's an argument I could make and have made:

- classB currently derives from classA, though that's an implementation detail
- IWYU mean "Include what you *use*", so when you use classA explicitly in 
classB, IWYU does not rely on that implementation detail, but instead suggests 
that you #include classA.h.

So if you changed classB not to derive from classA, but hold a pointer to it, 
you wouldn't want to change classB.cpp's includes.

But this is causing so much confusion that I wonder if it's really a sane 
policy (and as I said, I'm not even sure if it *is* an actual policy or just 
coincidental)

Original comment by kim.gras...@gmail.com on 6 Jun 2014 at 3:17

GoogleCodeExporter commented 9 years ago

You are right about the precompiled header path, sorry. MSVC just make sure 
that files include exactly what the setting for the precompiled header is, and 
it can include a path as long as it is in the setting too.

I read all what you said, but then why does it works sometimes? Depending on 
the path I use it will suggest to include classA or not. This seems strange, 
right?

All the debugging I'm doing is trying to compare the two cases, when it works 
and when it doesn't. It seems to me that it all comes down to a different 
behavior in CalculateIwyuForFullUse.

When it is working I get there with this:

actual_includes: TestLib/classB.h, classA.h, stdafx.h
use->suggested_header: classA.h

When it is not working properly I get this:

actual_includes: TestLib/classA.h, TestLib/classB.h, stdafx.h
use->suggested_header: classA.h

ContainsKey at the end returns false because the paths are not the same (but 
the files are the same), and that is set as a violation.

This really looks like a path confusion no?

Original comment by dpun...@gmail.com on 6 Jun 2014 at 3:40

GoogleCodeExporter commented 9 years ago

I modified a bit CalculateIwyuForFullUse so that it gets the FileEntries of the 
actual include files, so that I can inspect the full path of those includes and 
contrast them with the 'use'. I manage to find the header I want because I 
don't rely on relative paths anymore, so I don't set the use as a iwyu 
violation, but I still get the report telling me to add classA.h.

Most probably this is not the best approach, but I'm trying to understand 
what's going on. I'll keep investigating.

Original comment by dpun...@gmail.com on 6 Jun 2014 at 5:44

GoogleCodeExporter commented 9 years ago

One thing that's worth noting though is that now the log for the two cases is 
exactly the same, but the result differs. I have put verbose at 1000, so maybe 
the code is missing some log somewhere.

Original comment by dpun...@gmail.com on 6 Jun 2014 at 5:47

GoogleCodeExporter commented 9 years ago

I found what's making it to still report the issue. I found this part:

  // If we #include a .h through an associated file (foo.h) rather
  // than directly (foo.cc), we don't want to say that .h is desired
  // -- that will cause us to add it when it's unnecessary.  We could
  // choose to actually *remove* the .h here if it's present, to keep
  // #includes to a minimum, but for now we just decline to add it.
  for (vector<OneIncludeOrForwardDeclareLine>::iterator
           it = lines->begin(); it != lines->end(); ++it) {
    if (it->is_desired() && !it->is_present() && it->IsIncludeLine() &&
        ContainsKey(associated_desired_includes, it->quoted_include())) {
      it->clear_desired();
    }
  }

It's the same problem as before, because the relative path is not exactly the 
same it fails. Using FileEntries to compare the full name helps me fix it.

What can you tell me about this? Do you think it's a good solution to not rely 
on relative paths?

Original comment by dpun...@gmail.com on 6 Jun 2014 at 6:38

GoogleCodeExporter commented 9 years ago

This block also could not work properly with precompiled headers, right?

IwyuPreprocessorInfo::BelongsToMainCompilationUnit

// Heuristic: if the main compilation unit's *first* include is
  // a file with the same basename, assume that it's the 'related'
  // .h file, even if the canonical names differ.  This catches
  // cases like 'foo/x.cc' #includes 'foo/public/x.h', or
  // 'foo/mailserver/x.cc' #includes 'foo/public/x.h'.
  if (includer == main_file_ &&
      ContainsKeyValue(num_includes_seen_, includer, 1)) {
    if (GetCanonicalName(Basename(GetFilePath(includee))) ==
        GetCanonicalName(Basename(GetFilePath(main_file_))))
      return true;
  }

with precompiled headers the first file is supposed to be the precompiled header

Original comment by dpun...@gmail.com on 6 Jun 2014 at 7:32

GoogleCodeExporter commented 9 years ago

In order to make it work I had to do this:

// Heuristic: if the main compilation unit's *first* include is
  // a file with the same basename, assume that it's the 'related'
  // .h file, even if the canonical names differ.  This catches
  // cases like 'foo/x.cc' #includes 'foo/public/x.h', or
  // 'foo/mailserver/x.cc' #includes 'foo/public/x.h'.
  int index = GlobalFlags().pch_in_code ? 2 : 1;
  if (includer == main_file_ &&
      ContainsKeyValue(num_includes_seen_, includer, index)) {
    if (GetCanonicalName(Basename(GetFilePath(includee))) ==
        GetCanonicalName(Basename(GetFilePath(main_file_))))
      return true;
  }

Seems ok?

Original comment by dpun...@gmail.com on 6 Jun 2014 at 8:01

GoogleCodeExporter commented 9 years ago
Oops, yes, that last one seems like a proper bug fix for a problem introduced 
as part of pch_in_code. There's obviously a test missing for that heuristic.

Original comment by kim.gras...@gmail.com on 6 Jun 2014 at 8:45

GoogleCodeExporter commented 9 years ago
Thank you for the hard work with troubleshooting!

As for the confused paths, I can't quite see through it right now, but it 
sounds like you're onto a real problem.

> I read all what you said, but then why does it works sometimes?
> Depending on the path I use it will suggest to include classA
> or not. This seems strange, right?

It does. And this is why I hand-waved a lot around the details, it probably 
isn't implemented the way I said, it's just acting like it sometimes. :-/

With your changes in place, do the IWYU tests pass? Run "python 
run_iwyu_tests.py". On Windows I usually see six failures (badinc, 
overloaded_class, prefix_header_attribution, prefix_header_operator_new, 
precomputed_tpl_args, iterator) so if you see any more, you've broken something 
:-)

I'll have to look at this with fresh eyes, but moving away from comparing 
strings toward comparing FileEntry pointers seems good overall.

Original comment by kim.gras...@gmail.com on 6 Jun 2014 at 8:56

GoogleCodeExporter commented 9 years ago

I did some tests and it seems good. I need to cleanup my fix, then I'll do the 
tests you mention, then I can post it here. All this on monday :)

Have a good weekend!

Original comment by dpun...@gmail.com on 6 Jun 2014 at 9:23

GoogleCodeExporter commented 9 years ago

Hello,

I ran the iwyu tests, and there is one that fails for me that doesn't seem to 
fail for you:

======================================================================
FAIL: runTest (__main__.comment_pragmas)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_iwyu_tests.py", line 156, in <lambda>
    {'runTest': lambda self, f=filename: self.RunOneTest(f),
  File "run_iwyu_tests.py", line 127, in RunOneTest
    iwyu_flags, clang_flags, verbose=True)
  File "D:\Projects\llvm-3.5\tools\clang\tools\include-what-you-use\iwyu_test_util.py", line 424, in TestIwyuOnRelativeFile
    test_case.assertTrue(not failures, ''.join(failures))
AssertionError: False is not true : 
tests/cxx/comment_pragmas.cc:116: Unmatched regex:
CommentPragmasD2 is...*no_such_file.h

tests/cxx/comment_pragmas.cc:121: Unmatched regex:
CommentPragmasD3 is...*comment_pragmas-i6.h

tests/cxx/comment_pragmas.cc:126: Unmatched regex:
CommentPragmasD4 is...*comment_pragmas-i7.h

tests/cxx/comment_pragmas.cc:137: Unmatched regex:
CommentPragmasD8 is...*<some_system_header_file>

tests/cxx/comment_pragmas.cc:140: Unmatched regex:
CommentPragmasD9 is...*<some_system_header_file>

tests/cxx/comment_pragmas.cc:172: Unmatched regex:
CommentPragmasD17 is...*no_such_file_d17.h

However, do you think this could be related to my change? I wonder if it was 
failing before my changes.

I am not very happy with the test I had to do :( The problem is I couldn't find 
a good way of getting the FileEntry from just a string, and the problem is that 
in many places files are reference through a string and not a FileEntry. 
Couldn't IWYU just use FileEntry?

To give you an idea of what I had to do:

1) I had to modify CalculateMinimalIncludes to not only build a list of desired 
includes as string, but also a list of desired includes as FileEntry. Basically 
for each string added from the use::suggestedheader, I was adding also the 
decl_file. However, at Step3 it is going through a list of public headers, and 
this is all strings, so I am not sure what to do with that.

2) Now that we can get a list of FileEntry we can set the 
desired_includes_as_fileentries in CalculateIwyuViolations.

3) We get this list at the begining of CalculateAndReportIwyuViolations, and we 
pass it to CalculateDesiredIncludesAndForwardDeclares.

4) In CalculateDesiredIncludesAndForwardDeclares, at the last loop, we first 
see if the list of string contains the key we want, and if it doesn't then we 
check with the list of FileEntry that we just passed. Only the test with 
FileEntry should be necessary, but since the list may be incomplete due to the 
issue with the public headers I was afraid to do that.

I'm almost certain there must be a cleaner way of doing all this, possibly just 
using FileEntry pointers and not relying on the strings.

What do you think?

Original comment by dpun...@gmail.com on 9 Jun 2014 at 4:43

GoogleCodeExporter commented 9 years ago
Just to let you know I'm not ignoring this out of spite. I'm having a busy few 
weeks this and the next, but I hope to get back to this when I get back, unless 
vsapsai beats me to it :-)

Original comment by kim.gras...@gmail.com on 12 Jun 2014 at 7:35

GoogleCodeExporter commented 9 years ago
I haven't investigated it carefully, but I can say that we want to move from 
string comparison to FileEntry comparison.  It will help to fix <sys/errno.h> 
!= <errno.h>, "foo.h" != <foo.h>.  And it will help to resolve issue #5.  So at 
the first glance, FileEntry comparison is fine.

Original comment by vsap...@gmail.com on 12 Jun 2014 at 7:41

GoogleCodeExporter commented 9 years ago

Thank you, that sounds good. I think it would resolve a few issues very nicely 
(and probably be more efficient).

What I did works so far, but it's not complete because of these other paths we 
store just with the string.

Original comment by dpun...@gmail.com on 12 Jun 2014 at 8:03

GoogleCodeExporter commented 9 years ago
Here's a patch with test coverage for the bug dpunset found in comment 14 
(https://code.google.com/p/include-what-you-use/issues/detail?id=144#c14).

I haven't started looking into FileEntryfication yet, but it'd be nice to get 
this fix in independently.

Original comment by kim.gras...@gmail.com on 17 Aug 2014 at 9:37

Attachments:

GoogleCodeExporter commented 9 years ago
Looks good to me, works on Mac OS X.

Original comment by vsap...@gmail.com on 17 Aug 2014 at 11:13

GoogleCodeExporter commented 9 years ago
Thanks, committed in r564. I'll see if I can get my head around moving toward 
FileEntry everywhere, but I haven't given it much thought yet, so it might be a 
while.

David, do you have a patch you could share of your work so far?

Original comment by kim.gras...@gmail.com on 18 Aug 2014 at 6:47

GoogleCodeExporter commented 9 years ago

I would happily post one, although I never built one myself :_) Can you tell me 
what tool do you use to prepare the patch?

Original comment by dpun...@gmail.com on 18 Aug 2014 at 7:07

GoogleCodeExporter commented 9 years ago
If you use SVN from the command-line, cd into the include-what-you-use 
directory and do:

  $ svn diff > fileentry.patch

If you use TortoiseSVN, right-click the include-what-you-use directory and 
select "Create patch..." down toward the bottom of the context menu. It will 
prompt you with a dialog for selecting which files -- only include the ones 
you've modified (or properly added) -- and then you get to save the changes to 
a file.

Attach the patch to the issue and I should be able to apply it locally.

Thanks!

Original comment by kim.gras...@gmail.com on 18 Aug 2014 at 7:54

GoogleCodeExporter commented 9 years ago

Kim I haven't forgotten about it, it's just been crazy days at the office. I 
will try to do this tomorrow while I integrate your latest changes in the trunk 
to fix the issues with templates.

Original comment by dpun...@gmail.com on 20 Aug 2014 at 7:58

GoogleCodeExporter commented 9 years ago

This is what got generated, although I think there is more changes than what I 
did O_o maybe I did something wrong?

Original comment by dpun...@gmail.com on 27 Aug 2014 at 12:30

Attachments:

GoogleCodeExporter commented 9 years ago
It looks like it undoes one of my later changes, but at least it gives me an 
idea of where you had to make changes to get closer to using FileEntry 
consistently.

I'll see if I can understand it and build on it, thanks!

Original comment by kim.gras...@gmail.com on 27 Aug 2014 at 6:41

GoogleCodeExporter commented 9 years ago

If it's undoing something you did then I must have merged badly at some point. 
I will sync again tomorrow and merge, the only differences I should have are 
the case-insensitive change and the file entries.

Thank you!

Original comment by dpun...@gmail.com on 27 Aug 2014 at 7:05