felipeprov / include-what-you-use

Automatically exported from code.google.com/p/include-what-you-use
Other
0 stars 0 forks source link

Multiple locations for list of extensions #82

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
There are at least three places where a list of extensions for source files is 
made or used in the code. At least one of them (the python one) doesn't match 
the others.  Ideally, we'd find a DRY solution allowing a single list of these 
extensions to be used in all locations.  (They can be usually found by grepping 
for "\.cpp", since comments typically use ".cc" so ".cpp" creates fewer false 
positives.)

iwyu_path_util.cc:
A. around line 45
B. around line 96

fix_includes.py
C. line 155

Kinds of lists involved:
- Source (non-header) file extensions, at A, B, and C
- Header file extensions - b, but assumed to be the complement of the previous 
at A and C

Additionally, the lists of prefixes/suffixes and of source->header conversions 
found near B may benefit from being stored as data rather than embedded in 
code. (an array of POD structs that is statically initialized?)

Original issue reported on code.google.com by ryan.pav...@gmail.com on 20 Nov 2012 at 7:35

GoogleCodeExporter commented 8 years ago
A (very incomplete!) union of the two kinds of lists is also found in 
iwyu_test_util.py starting at line 85

Original comment by ryan.pav...@gmail.com on 20 Nov 2012 at 7:50

GoogleCodeExporter commented 8 years ago
It'd be nice to unify them. Do you have any ideas for how to use the same 
definitions across C++ and Python? I'd rather not introduce even more external 
files, seeing how mapping files complicated IWYU quite a lot. Maybe codegen?

Original comment by kim.gras...@gmail.com on 21 Nov 2012 at 7:36

GoogleCodeExporter commented 8 years ago
I was thinking of a compile-time solution, at least for use with C++. I'm 
thinking there might be a bit of dirty tricks with #include that could let it 
use the same list of strings as Python.  Something like:

source_extensions.inc:
".cc",
".cpp",
".cxx",
".C"

SomeSingleCppFile.cc:

static const char* extensions[] = {
#include "source_extensions.inc"
};

Not sure how to do it with Python given that it's all runtime there - if you 
don't want to bundle the file then it would have to be something like 
"building" the python file by substitution of the list into the source.

Original comment by ryan.pav...@gmail.com on 5 Dec 2012 at 12:18

GoogleCodeExporter commented 8 years ago
I went for a lo-fi solution: cross-referencing comments. I'd prefer the build 
to be dead simple rather than relying on codegen to generate Python from a 
language-agnostic source.

Semantics of GetCanonicalName have changed slightly: previously it would only 
strip one  file extension from the header and source set, now it strips one 
header extension and then any CXX extension.

I don't think this is a problem in practice, but let me know if the current 
behavior is important enough to be preserved.

Original comment by kim.gras...@gmail.com on 10 Dec 2012 at 3:45

Attachments:

GoogleCodeExporter commented 8 years ago
I agree just to cross-reference comments. Generating fix_includes.py during 
build or reading source_extensions.inc from fix_includes.py makes IWYU harder 
to understand and to use. And I think that end users should not pay for 
developers' convenience.

What about adding other C++ extensions? I've found more extensions in 
clang::driver::types [1] and clang::FrontendOptions [2]. Should we list 
extensions explicitly or call Clang functions? I prefer listing extensions, but 
can you create an alternative patch so we can compare?

Can we have more general name than cxx_extensions? It isn't necessary right 
now, but we'll need more general name when Objective-C support is implemented. 
The best name I can offer is c_implementation_extensions, but I don't like it.

I prefer to preserve semantics of GetCanonicalName so that code corresponds to 
real-life situations. The filename has layers -inl, _test | _unittest | etc, 
extension. I don't want to break these layers.

[1] http://clang.llvm.org/doxygen/Types_8cpp_source.html#l00123
[2] http://clang.llvm.org/doxygen/FrontendOptions_8cpp_source.html#l00014

Original comment by vsap...@gmail.com on 16 Dec 2012 at 11:17

GoogleCodeExporter commented 8 years ago
> clang::FrontendOptions

This seems to map input kinds -- headers and source files are all mapped to 
IK_CXX

> clang::driver::types

This looked more promising, but given that we not only classify the extensions 
but also strip them in GetCanonicalName, we need the explicit list. Unless we 
re-wire it to get the extension of the actual filename, and strip anything that 
matches TY_CXX, but it feels like the wrong direction. I could take a stab at 
it if anyone wants to see what it could look like.

> I prefer to preserve semantics of GetCanonicalName so that
> code corresponds to real-life situations. The filename has
> layers -inl, _test | _unittest | etc, extension.
> I don't want to break these layers.

That part is still handled after the extension has been stripped. But I think I 
see what you're saying about preserving the current invariants, I'll try and 
make externally visible behavior exactly the same.

I'll harvest some more extensions from types and FrontendOptions, and post a 
new patch!

Original comment by kim.gras...@gmail.com on 26 Dec 2012 at 8:50

GoogleCodeExporter commented 8 years ago
Updated patch.

- "source extensions" instead of "cxx extensions"
- Update GetCanonicalName semantics. A bit convoluted, but current behavior is 
exactly preserved.
- Source extensions are now: ".c", ".C", ".cc", ".CC", ".cxx", ".CXX", ".cpp", 
".CPP", ".c++", ".C++", ".cp"

Original comment by kim.gras...@gmail.com on 26 Dec 2012 at 10:44

Attachments:

GoogleCodeExporter commented 8 years ago
Thanks for patch, Kim. Committed r411.

Original comment by vsap...@gmail.com on 30 Dec 2012 at 8:15