Open PathogenDavid opened 4 years ago
This is also somewhat of an issue with MoveLooseDeclarationsIntoTypesTransformation because it uses the file paths to determine which type the loose declaration is associated with. (Although if the wrong casing was used, the generator is non-portable anyway.)
Here's the scenario that the case-insensitive path comparisons in Biohazrd are trying to solve. Consider the following library with two header files:
FileA.h
#pragma once
#include "fileb.h" // <-- Notice the wrong casing here
void FunctionA();
FileB.h
#pragma once
void FunctionB();
Now suppose the user of Biohazrd requests that it build FileA.h and FileB.h. The generated index file will look like:
#include "FileA.h"
#include "FileB.h"
The cursor tree we get back from Clang will look roughly like the following:
TranslationUnitDecl corresponding to <>TranslatedLibraryIndexFile.cpp (the Biohazrd index file)
    FunctionDecl for FunctionB corresponding to fileb.h
    FunctionDecl for FunctionA corresponding to FileA.h
Note that FunctionB showed up first despite being included second in the index file. This happened because FileA.h included FileB.h as fileb.h. The pre-processed version of the file as seen by Clang looks roughly like this:
#line 3 "fileb.h"
void FunctionB();
#line 4 "FileA.h"
void FunctionA();
When we encounter the FunctionB declaration, does it correspond to FileB.h? The answer depends on whether or not the file system is case-sensitive. Clang actually provides a method for getting the real path (which we do use), but that doesn't help us in certain weird situations. (Also, we prefer the path provided to us by the user, which isn't necessarily the best either; see #67.)
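The ambiguity above can be illustrated with a small, self-contained sketch. This is not Biohazrd's code (Biohazrd is written in C#); `equalsIgnoreCase` and `findInputFile` are hypothetical names, and the comparison is ASCII-only for brevity. It shows that the cursor path `fileb.h` matches the input file `FileB.h` only when casing is ignored.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// ASCII-only case-insensitive equality (an assumption of this sketch;
// real file systems may apply Unicode-aware folding rules).
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Returns the index of the input file that `cursorPath` refers to,
// or -1 if the path matches none of them (the cursor is out-of-scope).
int findInputFile(const std::vector<std::string>& inputFiles,
                  const std::string& cursorPath,
                  bool ignoreCase) {
    for (std::size_t i = 0; i < inputFiles.size(); ++i) {
        const bool match = ignoreCase
            ? equalsIgnoreCase(inputFiles[i], cursorPath)
            : inputFiles[i] == cursorPath;
        if (match) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the inputs `{"FileA.h", "FileB.h"}`, looking up `fileb.h` finds index 1 when `ignoreCase` is true and reports out-of-scope when it is false, which mirrors the two possible answers described above.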
(*We actually probably have a bug here: On debug builds I think this triggers an assert but in release builds the out-of-scope file becomes in-scope if it is encountered first and the in-scope file implicitly becomes out-of-scope.)
#include directives with the wrong casing.)
Poorly-formed libraries can't be processed on Linux when they should be able to.
It is conceivable that you might want to use Biohazrd on a Linux CI server to process one of the poorly-formed Windows-only libraries mentioned previously. To enable this scenario, you might use ext4's ability to mark a directory as case-insensitive, but this assumption breaks that.
It is conceivable that you might want to use Biohazrd in the context of WSL, Samba, or an NTFS drive. In all three of these scenarios the file system is case-insensitive, and assuming it is case-sensitive would break things for poorly-formed libraries.
TranslatedLibraryBuilder treats file paths as case-insensitive regardless of the case-sensitivity of the file system. This was done to avoid problems arising from differences in casing between what is provided to TranslatedLibraryBuilder, the file system, and the #include directives in the C++ source. (We regularly compare filenames for the sake of resolving which Clang cursors correspond to which input files, or to determine if the cursor is out-of-scope.)
We do not expect well-formed C++ code to involve incorrect casing or to have multiple files whose names differ only by casing, as both practices are non-portable. (The former is a warning, -Wnonportable-include-path, in Clang.) As such, this is not really a high priority as it's only really problematic with poorly-written C++ libraries on Linux or unusual macOS/Windows systems.
The ideal solution would be to normalize paths to their actual casing once https://github.com/dotnet/runtime/issues/14321 is realized.
I am hesitant to conditionally use case-sensitive comparisons based on the OSPlatform because case-sensitivity is an attribute of the file system, not the OS. (ext4 even allows case-sensitivity as an attribute of a directory.)
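One way to act on the point that case-sensitivity belongs to the file system (or even a single directory) is to probe the directory directly instead of checking the OS. The following is a minimal C++17 sketch of that idea, not something Biohazrd does. `isDirectoryCaseSensitive` and the probe file names are hypothetical, and the sketch assumes the directory is writable and contains no pre-existing file with the upper-case probe name.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>

namespace fs = std::filesystem;

// Probes `directory` by creating a lower-case file and checking whether
// the upper-case spelling of the same name resolves to it.
// Returns true if the directory treats names case-sensitively.
bool isDirectoryCaseSensitive(const fs::path& directory) {
    const fs::path lower = directory / "case_probe_7f3a.tmp";
    const fs::path upper = directory / "CASE_PROBE_7F3A.TMP";

    {
        std::ofstream probe(lower);  // creates an empty probe file
        if (!probe.is_open()) {
            throw std::runtime_error("cannot create probe file");
        }
    }

    // On a case-insensitive directory the upper-case name finds the probe.
    const bool upperResolves = fs::exists(upper);
    fs::remove(lower);
    return !upperResolves;
}
```

Because the answer can differ between directories on the same machine, a probe like this would need to run per directory of interest rather than once per process.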