AlDanial / cloc

cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.
GNU General Public License v2.0

(Correctly) detect c++ module files #733

Closed FabioFracassi closed 1 year ago

FabioFracassi commented 1 year ago

Describe the bug C++ has supported modules since C++20, and more compilers (MSVC, Clang) have recently started supporting them in practice. This brings two new file extensions (.cppm and .ixx) that are or will be used for C++ code. cloc ignores .cppm files and counts .ixx files as "Visual Studio Module" (which is not strictly wrong, but unhelpful IMO).

cloc; OS; OS version

To Reproduce Scan a directory that contains .cppm and/or .ixx files.
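
A minimal sketch of that reproduction step, assuming cloc is on the PATH. The directory name, file names, and file contents are made up for illustration and are not from the original report:

```shell
# Create a scratch directory holding one file per extension.
# All names here are hypothetical placeholders.
mkdir -p cloc-module-repro
printf 'export module hello;\nexport int answer() { return 42; }\n' > cloc-module-repro/hello.cppm
printf 'export module world;\nexport int zero() { return 0; }\n' > cloc-module-repro/world.ixx

# Scan the directory only if cloc is actually installed.
if command -v cloc >/dev/null 2>&1; then
    cloc cloc-module-repro
fi
```

Going by the report, v1.96 would be expected to skip hello.cppm and list world.ixx under "Visual Studio Module"; the exact output depends on the installed version.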

Expected result The .cppm and .ixx files are counted and treated as C++ code.

Additional context The extensions are not prescribed by the C++ standard, but these two (.cppm, .ixx) seem to have some consensus among compiler and tool authors.

It is conceivable to treat module code separately, e.g. with a language definition "C++ module" similar to "C/C++ Header", but I think the utility of that would be limited: "normal" .cpp/.cxx/... C++ files can contain modules as well, and I personally never liked differentiating between "C++" and "C/C++ Headers".

FabioFracassi commented 1 year ago

Additional data point: CMake added a few more extensions to be treated as C++ in their most recent release (3.27.0): "The CXX language now treats source file extensions .ccm, .cxxm, and .c++m as C++."

AlDanial commented 1 year ago

The next release will have these updates. Until then you can get the desired behavior with v1.96 by making a cloc configuration file with these entries:

--force-lang=C++,cppm
--force-lang=C++,ixx
--force-lang=C++,ccm
--force-lang=C++,cxxm
--force-lang=C++,c++m
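
One way to apply these entries, as a sketch: write them to a file and pass it with cloc's --config switch. The file name cloc-modules.cfg and the scanned path "." are arbitrary placeholders, not part of the original comment:

```shell
# Write the five overrides listed above into a config file.
# The file name is an arbitrary choice for this sketch.
cat > cloc-modules.cfg <<'EOF'
--force-lang=C++,cppm
--force-lang=C++,ixx
--force-lang=C++,ccm
--force-lang=C++,cxxm
--force-lang=C++,c++m
EOF

# Run cloc with that config if it is installed; "." stands in for the source tree.
if command -v cloc >/dev/null 2>&1; then
    cloc --config cloc-modules.cfg .
fi
```

Keeping the overrides in a file avoids retyping five switches on every run, and the file can be deleted once a release with built-in support is installed.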

Regarding "C/C++ Headers": how can one distinguish a .h file for a C program from a .h file for a C++ program? Sure, C++ will understand a C .h, but the converse is not true.