ned14 / pcpp

A C99 preprocessor written in pure Python

Add a "--no-include" option for "light-weighted" pre-processing. #34

Closed smwikipedia closed 4 years ago

smwikipedia commented 4 years ago

Add a "--no-include" option for "light-weighted" pre-processing, which disables handling of the #include directive. This would let me use pcpp to generate an include map for a .c/.h file. It would also go some way toward meeting the requirements in the links below:

https://stackoverflow.com/questions/33409754/how-to-expand-macros-only-as-a-preprocessing-step-to-a-c-file
https://stackoverflow.com/questions/2615931/run-a-light-preprocessor-for-gcc

ned14 commented 4 years ago

I'm a little surprised at the need for such an option; including local files is usually desirable, and you can prevent the inclusion of non-local files, including system headers, by not giving it search paths.

Can you give me an example where never including any files at all would be desirable?

smwikipedia commented 4 years ago

> I'm a little surprised at the need for such an option; including local files is usually desirable, and you can prevent the inclusion of non-local files, including system headers, by not giving it search paths.
>
> Can you give me an example where never including any files at all would be desirable?

I am trying to generate an include map that reflects the relations between headers, covering both local and system headers. The pre-processing steps I need are as follows:

  1. My script provides a .c/.h file to pcpp.

  2. My script also supplies the macro definitions so that pcpp can evaluate directives such as #ifdef.

  3. My script, however, must itself resolve the relative header paths in the output that pcpp produces in step 2. Without a --no-include option, pcpp implicitly processes any included header that sits in the same folder as the including file, and my script never learns that this happened. The include hierarchy is therefore short-circuited, and the include map misses those headers.

Hope I made myself clear.

smwikipedia commented 4 years ago

I just realized a potential semantic difference.

When generating the include graph, I pass a fixed set of macro definitions to the preprocessor as the preprocessing context, and my current approach expects the preprocessor to leave #include directives alone.

For example,

(example.c)

#ifdef H1
#include <header1.h>
#endif

#ifdef H2
#include <header2.h>
#endif

I pass only -DH1 to the preprocessor. With my current include graph approach and pcpp run with a --no-include option, example.c then depends on header1.h but not on header2.h.

But if header1.h defines H2, example.c depends on header2.h as well. My approach misses header2.h because it does not see the change in macro context introduced by header1.h.

So handling the #include directive is essentially a depth-first process guided by the current macro context of the compilation unit. Once a header is included, it can change that macro context and thereby affect how later #include directives are handled.

I am not sure about the detailed C99 preprocessing rules, so I will check the spec. But I believe the effective scope of a macro definition or undefinition runs from the point where it is defined/undefined to the end of the compilation unit.

So if the macro context is not properly updated, or if there are any unresolved headers, the preprocessing result can be incorrect.
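To make the difference concrete, here is a minimal toy sketch in Python. It is not pcpp and not a real preprocessor: `scan_includes` is a hypothetical helper that only understands #ifdef, #ifndef, #else, #endif, #define, #undef and #include, and the `files` dictionary stands in for the file system. The content of header1.h (a single `#define H2`) is assumed for illustration, following the scenario above.

```python
def scan_includes(name, files, macros, follow_includes, found=None):
    """Toy scanner; returns the headers that `name` would pull in.

    `files` maps a file name to its text (a stand-in for the file system).
    `macros` is a set of defined macro names and is mutated in place, so a
    definition made inside an included header stays visible to the includer.
    """
    if found is None:
        found = []
    stack = []  # one bool per open #ifdef/#ifndef: is that branch active?
    for line in files[name].splitlines():
        parts = line.strip().split()
        if not parts or not parts[0].startswith("#"):
            continue  # not a directive; the toy ignores ordinary code
        directive, args = parts[0], parts[1:]
        active = all(stack)  # active only if every enclosing branch is
        if directive == "#ifdef":
            stack.append(args[0] in macros)
        elif directive == "#ifndef":
            stack.append(args[0] not in macros)
        elif directive == "#else":
            stack[-1] = not stack[-1]
        elif directive == "#endif":
            stack.pop()
        elif not active:
            continue  # skip directives inside a disabled branch
        elif directive == "#define":
            macros.add(args[0])
        elif directive == "#undef":
            macros.discard(args[0])
        elif directive == "#include":
            header = args[0].strip('<>"')
            found.append(header)
            if follow_includes and header in files:
                # Depth-first: scan the header before the rest of this file.
                scan_includes(header, files, macros, True, found)
    return found


files = {
    "example.c": (
        "#ifdef H1\n#include <header1.h>\n#endif\n\n"
        "#ifdef H2\n#include <header2.h>\n#endif\n"
    ),
    "header1.h": "#define H2\n",  # assumed content, for illustration only
    "header2.h": "",
}

# "Light" pass: #include lines are recorded but never opened.
light = scan_includes("example.c", files, {"H1"}, follow_includes=False)
# Full pass: each header is scanned in place, updating the macro context.
full = scan_includes("example.c", files, {"H1"}, follow_includes=True)

print(light)  # ['header1.h']
print(full)   # ['header1.h', 'header2.h']
```

With only H1 defined, the light pass never sees header2.h, while the full pass finds it because header1.h changed the macro context first.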

smwikipedia commented 4 years ago

Some quotes:

> The ‘#include’ directive works by directing the C preprocessor to scan the specified file as input before continuing with the rest of the current file. The output from the preprocessor contains the output already generated, followed by the output resulting from the included file, followed by the output that comes from the text after the ‘#include’ directive.
>
> -- ref 1

Also according to the link above, the compiler sees the same token stream as if the header's content had been placed directly at that point.

So the macros from an included file will affect the preprocessing of the remaining part of the compilation unit.

So, the --no-include option can be helpful to provide a light-weighted result if the header doesn't affect the remaining preprocessing. Otherwise, the light-weighted result will be inaccurate.

smwikipedia commented 4 years ago

pcpp is a useful tool, but for now I have switched to the gcc preprocessor instead, because it provides more flag information in the line markers of the final preprocessed file, which leads to a much simpler and more accurate solution for me. (ref) Thanks.

ned14 commented 4 years ago

For me personally, if I were making an include map, I'd write a little bit of Python extending Preprocessor to intercept each file inclusion and record each file that is included, or would be included.

mmomtchev commented 3 years ago

@ned14 @smwikipedia I have a very particular use case where I want to expand a single macro (I am dropping support for an obsolete API by expanding its #ifdef macros). I think this would be a useful feature.