compnerd / ids

Interface Analysis Utility
BSD 3-Clause "New" or "Revised" License

Support bulk source processing mode #28

Open compnerd opened 7 months ago

compnerd commented 7 months ago

Clang permits a bulk mode in which a number of source files are passed in a single invocation to be compiled and linked. Doing so requires spinning up a new frontend per source file, whereas we currently create a single tool instance, which is equivalent to a single invocation.

Reported by @jvstech!

jvstech commented 7 months ago

As we discussed, I'm not convinced this is a bug in the program code so much as it is in Clang tooling. The program appears to be doing things that should be allowed.

The documentation for the clang::tooling::ClangTool class and its constructor states:

/// Utility to run a FrontendAction over a set of files.
///
/// This class is written to be usable for command line utilities.
/// By default the class uses ClangSyntaxOnlyAdjuster to modify
/// command line arguments before the arguments are used to run
/// a frontend action. One could install an additional command line
/// arguments adjuster by calling the appendArgumentsAdjuster() method.
  /// Constructs a clang tool to run over a list of files.
  ///
  /// \param Compilations The CompilationDatabase which contains the compile
  ///        command lines for the given source paths.
  /// \param SourcePaths The source files to run over. If a source files is
  ///        not found in Compilations, it is skipped.
  /// \param PCHContainerOps The PCHContainerOperations for loading and creating
  /// clang modules.
  /// \param BaseFS VFS used for all underlying file accesses when running the
  /// tool.
  /// \param Files The file manager to use for underlying file operations when
  /// running the tool.
  ClangTool(const CompilationDatabase &Compilations,
            ArrayRef<std::string> SourcePaths,
            std::shared_ptr<PCHContainerOperations> PCHContainerOps =
                std::make_shared<PCHContainerOperations>(),
            IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS =
                llvm::vfs::getRealFileSystem(),
            IntrusiveRefCntPtr<FileManager> Files = nullptr);

Additionally, this is what the direct header comments say:

//  This file implements functions to run clang tools standalone instead
//  of running them as a plugin.
//
//  A ClangTool is initialized with a CompilationDatabase and a set of files
//  to run over. The tool will then run a user-specified FrontendAction over
//  all TUs in which the given files are compiled.
//
//  It is also possible to run a FrontendAction over a snippet of code by
//  calling runToolOnCode, which is useful for unit testing.
//
//  Applications that need more fine grained control over how to run
//  multiple FrontendActions over code can use ToolInvocation.
//
//  Example tools:
//  - running clang -fsyntax-only over source code from an editor to get
//    fast syntax checks
//  - running match/replace tools over C++ code

And this is exactly what idt is doing. Specifically, these comments call out more than once that a tool can be run over more than one source file.

In my very cursory debugging, the issue seems to be that the clang::DiagnosticsEngine reference held by the clang::SourceManager (under the clang::ASTContext provided by clang::CompilerInstance) is getting a stale set of diagnostic IDs when a new source file is handed to it. In our case, after giving idt a list of all the header files in llvm/include/llvm/Support/, it runs out of custom diagnostic IDs after the 70th file, at which point the maximum is exceeded.

clang does it differently since it spins up a clang::CompilerInstance for each input file it's given, but Clang tooling is supposed to make it so that isn't required.
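
To make the difference concrete, here is a toy model of the failure mode described above. It is not Clang code: DiagIdPool, processFile, runShared and runPerFile are hypothetical names, and the cap and per-file counts are arbitrary values chosen for illustration, not Clang's real limits. It only shows why a capped ID pool that outlives each file can be exhausted partway through a batch, while a fresh pool per file cannot.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Toy stand-in for a capped pool of custom diagnostic IDs.
// Assumption: this only models "a fixed maximum that can be exceeded";
// it does not mirror how Clang actually stores or allocates IDs.
class DiagIdPool {
public:
  explicit DiagIdPool(std::size_t cap) : cap_(cap) {}

  // Hands out the next ID, or std::nullopt once the cap is reached.
  std::optional<std::size_t> allocate() {
    if (next_ >= cap_)
      return std::nullopt;
    return next_++;
  }

private:
  std::size_t cap_;
  std::size_t next_ = 0;
};

// Hypothetical per-file work: requests idsPerFile IDs from the pool.
// Returns false as soon as the pool is exhausted.
bool processFile(DiagIdPool &pool, std::size_t idsPerFile) {
  for (std::size_t i = 0; i < idsPerFile; ++i)
    if (!pool.allocate())
      return false;
  return true;
}

// One pool shared by every file, like state that is carried over
// between source files. Returns how many files completed.
std::size_t runShared(std::size_t files, std::size_t cap,
                      std::size_t idsPerFile) {
  DiagIdPool pool(cap);
  std::size_t done = 0;
  for (; done < files; ++done)
    if (!processFile(pool, idsPerFile))
      break;
  return done;
}

// A fresh pool for each file, like spinning up a new instance per
// input. Returns how many files completed.
std::size_t runPerFile(std::size_t files, std::size_t cap,
                       std::size_t idsPerFile) {
  std::size_t done = 0;
  for (; done < files; ++done) {
    DiagIdPool pool(cap);
    if (!processFile(pool, idsPerFile))
      break;
  }
  return done;
}
```

With 5 files, a cap of 8 and 3 IDs per file, runShared stops after 2 files because the third file needs IDs 6, 7 and 8 and only 8 exist, while runPerFile completes all 5.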

compnerd commented 7 months ago

Interesting! So it could be a clang issue after all. Definitely does seem like it should work.