gulrak / filesystem

An implementation of C++17 std::filesystem for C++11/C++14/C++17/C++20 on Windows, macOS, Linux and FreeBSD.
MIT License

Filesystem

This is a header-only single-file std::filesystem compatible helper library, based on the C++17 and C++20 specs, but implemented for C++11, C++14, C++17 or C++20 (tightly following the C++17 standard with very few documented exceptions). It is currently tested on macOS 10.12/10.14/10.15/11.6, Windows 10, Ubuntu 18.04, Ubuntu 20.04, CentOS 7, CentOS 8, FreeBSD 12, Alpine ARM/ARM64 Linux and Solaris 10, but should work on other systems too, as long as you have at least a C++11 compatible compiler. It should work with Android NDK and Emscripten, I even had reports of it being used on iOS (within sandboxing constraints), and with v1.5.6 there is experimental support for QNX. The support of Android NDK, Emscripten, QNX and, since v1.5.14, GNU/Hurd and Haiku is not backed up by automated testing, but they are reported to work, and PRs and bug reports are welcome for those too. It is of course in its own namespace ghc::filesystem to not interfere with a regular std::filesystem should you use it in a mixed C++17 environment (which is possible).

Test coverage is well above 90%, and starting with v1.3.6 and in v1.5.0 more time was invested in benchmarking and optimizing parts of the library. I'll try to continue to optimize some parts and refactor others, striving to improve it as long as it doesn't introduce additional C++17/C++20 compatibility issues. Feedback is always welcome. Simply open an issue if you see something missing or wrong or not behaving as expected and I'll comment.

Motivation

I'm often in need of filesystem functionality, mostly fs::path, but directory access too, and when beginning to use C++11, I used that language update to try to reduce my third-party dependencies. I could drop most of what I used, but still missed some stuff that I started implementing for the fun of it. Originally I based these helpers on my own coding and naming conventions. When C++17 was finalized, I wanted to use that interface, but it took a while to push myself to convert my classes.

The implementation is closely based on chapter 30.10 from the C++17 standard and a draft close to that version is Working Draft N4687. It is from after the standardization of C++17 but it contains the latest filesystem interface changes compared to the Working Draft N4659. Starting with v1.4.0, when compiled using C++20, it adapts to the changes in path sorting order and std::u8string handling from Working Draft N4860.

I want to thank the people working on improving C++. I really liked how the language evolved with C++11 and the following standards. Keep up the good work!

Why the namespace GHC?

If you ask yourself what ghc stands for, it is simply "gulrak's helper classes". Yeah, I know, not very imaginative, but I wanted a short namespace and I use it in some of my private classes (so it has nothing to do with Haskell, sorry for the name clash).

Platforms

ghc::filesystem is developed on macOS but CI tested on macOS, Windows, various Linux Distributions, FreeBSD and starting with v1.5.12 on Solaris. It should work on any of these with a C++11-capable compiler. Also there are some checks to hopefully better work on Android, but as I currently don't test with the Android NDK, I wouldn't call it a supported platform yet, same is valid for using it with Emscripten. It is now part of the detected platforms, I fixed the obvious issues and ran some tests with it, so it should be fine. All in all, I don't see it replacing std::filesystem where full C++17 or C++20 is available, it doesn't try to be a "better" std::filesystem, just an almost drop-in if you can't use it (with the exception of the UTF-8 preference).

:information_source: Important: This implementation is following the "UTF-8 Everywhere" philosophy in that all std::string instances will be interpreted the same as std::u8string encoding wise and as being in UTF-8. The std::u16string will be seen as UTF-16. See Differences in API for more information.

Unit tests are currently run with:

Tests

The header comes with a set of unit tests and uses CMake as a build tool and Catch2 as test framework. All tests are registered with CMake, so the ctest command can be used to run the tests.

All tests against this implementation should succeed. Depending on your environment there might be some warnings, e.g. if you have no rights to create symlinks on Windows (or at least the test thinks so), but these are just informative.

To build the tests from inside the project directory under macOS or Linux, just run:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make
ctest

This generates the test binaries that run the tests and the last command executes them.

If the default compiler is GCC 8 or newer, or Clang 7 or newer, it additionally tries to build a version of the test binary compiled against GCC's/Clang's std::filesystem implementation, named std_filesystem_test, as an additional test of conformance. Ideally all tests should compile and succeed with all filesystem implementations, but in reality there are some differences in behavior, sometimes due to room for interpretation in the standard, and there might be issues in these implementations too.

Usage

Downloads

The latest release version is v1.5.14 and source archives can be found here.

The latest pre-native-backend version is v1.4.0 and source archives can be found here.

The latest pre-C++20-support release version is v1.3.10 and source archives can be found here.

Currently only the latest minor release version receives bugfixes, so if possible, you should use the latest release.

Using it as Single-File-Header

As ghc::filesystem is first of all a header-only library, it should be enough to copy the header or the include/ghc directory into your project folder, or point your include path to this place, and simply include the filesystem.hpp header (or ghc/filesystem.hpp if you use the subdirectory).

Everything is in the namespace ghc::filesystem, so one way to use it only as a fallback could be:

#if _MSVC_LANG >= 201703L || __cplusplus >= 201703L && defined(__has_include)
    // ^ Supports MSVC prior to 15.7 without setting /Zc:__cplusplus to fix __cplusplus
    // _MSVC_LANG works regardless. But without the switch, the compiler always reported 199711L: https://blogs.msdn.microsoft.com/vcblog/2018/04/09/msvc-now-correctly-reports-__cplusplus/
    #if __has_include(<filesystem>) // Two stage __has_include needed for MSVC 2015 and per https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005finclude.html
        #define GHC_USE_STD_FS

        // Old Apple OSs don't support std::filesystem, though the header is available at compile
        // time. In particular, std::filesystem is unavailable before macOS 10.15, iOS/tvOS 13.0,
        // and watchOS 6.0.
        #ifdef __APPLE__
            #include <Availability.h>
            // Note: This intentionally uses std::filesystem on any new Apple OS, like visionOS
            // released after std::filesystem, where std::filesystem is always available.
            // (All other __<platform>_VERSION_MIN_REQUIREDs will be undefined and thus 0.)
            #if __MAC_OS_X_VERSION_MIN_REQUIRED && __MAC_OS_X_VERSION_MIN_REQUIRED < 101500 \
             || __IPHONE_OS_VERSION_MIN_REQUIRED && __IPHONE_OS_VERSION_MIN_REQUIRED < 130000 \
             || __TV_OS_VERSION_MIN_REQUIRED && __TV_OS_VERSION_MIN_REQUIRED < 130000 \
             || __WATCH_OS_VERSION_MAX_ALLOWED && __WATCH_OS_VERSION_MAX_ALLOWED < 60000
                #undef GHC_USE_STD_FS
            #endif  
        #endif
    #endif
#endif

#ifdef GHC_USE_STD_FS
    #include <filesystem>
    namespace fs = std::filesystem;
#else
    #include "filesystem.hpp"
    namespace fs = ghc::filesystem;
#endif

If you want to also use the fstream wrapper with path support as fallback, you might use:

#if _MSVC_LANG >= 201703L || __cplusplus >= 201703L && defined(__has_include)
    // ^ Supports MSVC prior to 15.7 without setting /Zc:__cplusplus to fix __cplusplus
    // _MSVC_LANG works regardless. But without the switch, the compiler always reported 199711L: https://blogs.msdn.microsoft.com/vcblog/2018/04/09/msvc-now-correctly-reports-__cplusplus/
    #if __has_include(<filesystem>) // Two stage __has_include needed for MSVC 2015 and per https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005finclude.html
        #define GHC_USE_STD_FS

        // Old Apple OSs don't support std::filesystem, though the header is available at compile
        // time. In particular, std::filesystem is unavailable before macOS 10.15, iOS/tvOS 13.0,
        // and watchOS 6.0.
        #ifdef __APPLE__
            #include <Availability.h>
            // Note: This intentionally uses std::filesystem on any new Apple OS, like visionOS
            // released after std::filesystem, where std::filesystem is always available.
            // (All other __<platform>_VERSION_MIN_REQUIREDs will be undefined and thus 0.)
            #if __MAC_OS_X_VERSION_MIN_REQUIRED && __MAC_OS_X_VERSION_MIN_REQUIRED < 101500 \
             || __IPHONE_OS_VERSION_MIN_REQUIRED && __IPHONE_OS_VERSION_MIN_REQUIRED < 130000 \
             || __TV_OS_VERSION_MIN_REQUIRED && __TV_OS_VERSION_MIN_REQUIRED < 130000 \
             || __WATCH_OS_VERSION_MAX_ALLOWED && __WATCH_OS_VERSION_MAX_ALLOWED < 60000
                #undef GHC_USE_STD_FS
            #endif  
        #endif
    #endif
#endif

#ifdef GHC_USE_STD_FS
    #include <filesystem>
    #include <fstream>  // needed for std::ifstream, std::ofstream and std::fstream below
    namespace fs {
        using namespace std::filesystem;
        using ifstream = std::ifstream;
        using ofstream = std::ofstream;
        using fstream = std::fstream;
    }
#else
    #include "filesystem.hpp"
    namespace fs {
        using namespace ghc::filesystem;
        using ifstream = ghc::filesystem::ifstream;
        using ofstream = ghc::filesystem::ofstream;
        using fstream = ghc::filesystem::fstream;
    }
#endif

Now you have e.g. fs::ofstream out(somePath); and it is either the wrapper or the C++17 std::ofstream.

:information_source: Be aware: as a header-only library, it does not hide the fact that it uses system includes, so they "pollute" your global namespace. Use the forwarding-/implementation-header based approach (see below) to avoid this. For Windows it needs Windows.h, and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including the filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem, to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements.

:information_source: Hint: There is an additional header named ghc/fs_std.hpp that implements this dynamic selection of a filesystem implementation, that you can include instead of ghc/filesystem.hpp when you want std::filesystem where available and ghc::filesystem where not.

Using it as Forwarding-/Implementation-Header

Alternatively, starting from v1.1.0, ghc::filesystem can also be used by including one of two additional wrapper headers. These allow you to include a forwarded version in most places (ghc/fs_fwd.hpp) while hiding the implementation details in a single cpp file that includes ghc/fs_impl.hpp to implement the needed code. Using ghc::filesystem this way makes sure system includes are only visible from inside the cpp file; all other places are clean.

Be aware that it is currently not supported to hide the implementation in a Windows DLL, as a DLL interface with C++ standard templates in interfaces is a different beast. If someone is willing to give it a try, I might integrate a PR, but currently working on that myself is not a priority.

If you use the forwarding/implementation approach, you can still use the dynamic switching like this:

#if _MSVC_LANG >= 201703L || __cplusplus >= 201703L && defined(__has_include)
    // ^ Supports MSVC prior to 15.7 without setting /Zc:__cplusplus to fix __cplusplus
    // _MSVC_LANG works regardless. But without the switch, the compiler always reported 199711L: https://blogs.msdn.microsoft.com/vcblog/2018/04/09/msvc-now-correctly-reports-__cplusplus/
    #if __has_include(<filesystem>) // Two stage __has_include needed for MSVC 2015 and per https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005finclude.html
        #define GHC_USE_STD_FS

        // Old Apple OSs don't support std::filesystem, though the header is available at compile
        // time. In particular, std::filesystem is unavailable before macOS 10.15, iOS/tvOS 13.0,
        // and watchOS 6.0.
        #ifdef __APPLE__
            #include <Availability.h>
            // Note: This intentionally uses std::filesystem on any new Apple OS, like visionOS
            // released after std::filesystem, where std::filesystem is always available.
            // (All other __<platform>_VERSION_MIN_REQUIREDs will be undefined and thus 0.)
            #if __MAC_OS_X_VERSION_MIN_REQUIRED && __MAC_OS_X_VERSION_MIN_REQUIRED < 101500 \
             || __IPHONE_OS_VERSION_MIN_REQUIRED && __IPHONE_OS_VERSION_MIN_REQUIRED < 130000 \
             || __TV_OS_VERSION_MIN_REQUIRED && __TV_OS_VERSION_MIN_REQUIRED < 130000 \
             || __WATCH_OS_VERSION_MAX_ALLOWED && __WATCH_OS_VERSION_MAX_ALLOWED < 60000
                #undef GHC_USE_STD_FS
            #endif  
        #endif
    #endif
#endif

#ifdef GHC_USE_STD_FS
    #include <filesystem>
    #include <fstream>  // needed for std::ifstream, std::ofstream and std::fstream below
    namespace fs {
        using namespace std::filesystem;
        using ifstream = std::ifstream;
        using ofstream = std::ofstream;
        using fstream = std::fstream;
    }
#else
    #include "fs_fwd.hpp"
    namespace fs {
        using namespace ghc::filesystem;
        using ifstream = ghc::filesystem::ifstream;
        using ofstream = ghc::filesystem::ofstream;
        using fstream = ghc::filesystem::fstream;
    }
#endif

and in the implementation-hiding cpp, you might use (before any include that includes ghc/fs_fwd.hpp, to take precedence):

#if _MSVC_LANG >= 201703L || __cplusplus >= 201703L && defined(__has_include)
    // ^ Supports MSVC prior to 15.7 without setting /Zc:__cplusplus to fix __cplusplus
    // _MSVC_LANG works regardless. But without the switch, the compiler always reported 199711L: https://blogs.msdn.microsoft.com/vcblog/2018/04/09/msvc-now-correctly-reports-__cplusplus/
    #if __has_include(<filesystem>) // Two stage __has_include needed for MSVC 2015 and per https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005finclude.html
        #define GHC_USE_STD_FS

        // Old Apple OSs don't support std::filesystem, though the header is available at compile
        // time. In particular, std::filesystem is unavailable before macOS 10.15, iOS/tvOS 13.0,
        // and watchOS 6.0.
        #ifdef __APPLE__
            #include <Availability.h>
            // Note: This intentionally uses std::filesystem on any new Apple OS, like visionOS
            // released after std::filesystem, where std::filesystem is always available.
            // (All other __<platform>_VERSION_MIN_REQUIREDs will be undefined and thus 0.)
            #if __MAC_OS_X_VERSION_MIN_REQUIRED && __MAC_OS_X_VERSION_MIN_REQUIRED < 101500 \
             || __IPHONE_OS_VERSION_MIN_REQUIRED && __IPHONE_OS_VERSION_MIN_REQUIRED < 130000 \
             || __TV_OS_VERSION_MIN_REQUIRED && __TV_OS_VERSION_MIN_REQUIRED < 130000 \
             || __WATCH_OS_VERSION_MAX_ALLOWED && __WATCH_OS_VERSION_MAX_ALLOWED < 60000
                #undef GHC_USE_STD_FS
            #endif  
        #endif
    #endif
#endif

#ifndef GHC_USE_STD_FS
    #include "fs_impl.hpp"
#endif

:information_source: Hint: There are additional helper headers, named ghc/fs_std_fwd.hpp and ghc/fs_std_impl.hpp that use this technique, so you can simply include them if you want to dynamically select the filesystem implementation.

Git Submodule and CMake

Starting from v1.1.0, it is possible to add ghc::filesystem as a git submodule, add the directory to your CMakeLists.txt with add_subdirectory() and then simply use target_link_libraries(your-target ghc_filesystem) to ensure a correct include path that allows #include <ghc/filesystem.hpp> to work.

The CMakeLists.txt offers a few options to customize its behavior:

Bazel

Please use hedronvision/bazel-cc-filesystem-backport, which will automatically set everything up for you.

Versioning

There is a version macro GHC_FILESYSTEM_VERSION defined in case future changes make it necessary to react to the version, but I don't plan to break anything. It's the version as a decimal number (major * 10000 + minor * 100 + patch).

:information_source: Note: Only even patch versions will be used for releases, and odd patch versions will only be used for in-between commits while working on the next version.
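The version encoding described above can be illustrated with a few small helpers. This is only a sketch: the function names are hypothetical and not part of ghc::filesystem; only the formula and the even/odd patch convention come from the text above.

```cpp
// Hypothetical helpers for illustration; they are not part of ghc::filesystem.
// The encoding major * 10000 + minor * 100 + patch is taken from the text above.
constexpr int make_version(int major, int minor, int patch)
{
    return major * 10000 + minor * 100 + patch;
}

constexpr int version_major(int v) { return v / 10000; }
constexpr int version_minor(int v) { return (v / 100) % 100; }
constexpr int version_patch(int v) { return v % 100; }

// Per the note above, release versions use even patch numbers.
constexpr bool is_release_version(int v) { return version_patch(v) % 2 == 0; }

static_assert(make_version(1, 5, 14) == 10514, "v1.5.14 encodes as 10514");
```

In real code the same arithmetic would typically be applied in the preprocessor, e.g. a check like `#if GHC_FILESYSTEM_VERSION >= 10500` to require at least v1.5.0.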

Documentation

There is almost no documentation in this release, as any std::filesystem documentation would work, besides the few differences explained in the next section. So you might head over to https://en.cppreference.com/w/cpp/filesystem for a description of the components of this library.

When compiling with C++11, C++14 or C++17, the API follows the C++17 standard where possible, with the exception that std::string_view parameters are only supported on C++17. When compiling with C++20, ghc::filesystem defaults to the C++20 API, with the char8_t and std::u8string interfaces and the deprecated fs::u8path factory method.

:information_source: Note: If the C++17 API should be enforced even in C++20 mode, use the define GHC_FILESYSTEM_ENFORCE_CPP17_API. Even then it is possible to create fs::path from std::u8string, but fs::path::u8string() and fs::path::generic_u8string() return normal UTF-8 encoded std::string instances, so code written for C++17 could still work with ghc::filesystem when compiled with C++20.

The only additions to the standard are documented here:

ghc::filesystem::ifstream, ghc::filesystem::ofstream, ghc::filesystem::fstream

These are simple wrappers around std::ifstream, std::ofstream and std::fstream. They simply add an open() method and a constructor with a ghc::filesystem::path argument, as the fstream variants in C++17 have them.
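As a rough illustration of what path-aware streams give you, here is a minimal sketch that uses the C++17 std:: streams with std::filesystem::path so that it is self-contained. The helper names write_text and read_text are hypothetical. Based on the description above, the ghc::filesystem stream wrappers should be usable the same way with ghc::filesystem::path, but that substitution is an assumption and is not shown here.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;  // assumption: a C++17 standard library is available

// Hypothetical helper: write text to the file at p, replacing existing content.
bool write_text(const fs::path& p, const std::string& text)
{
    std::ofstream out(p, std::ios::binary | std::ios::trunc);  // constructor taking a path
    out << text;
    out.flush();
    return static_cast<bool>(out);  // false if opening or writing failed
}

// Hypothetical helper: read the whole file at p into a string.
std::string read_text(const fs::path& p)
{
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```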

ghc::filesystem::u8arguments

This is a helper class that currently checks for UTF-8 encoding on non-Windows platforms but on Windows it fetches the command line arguments as Unicode strings from the OS with

::CommandLineToArgvW(::GetCommandLineW(), &argc)

and then converts them to UTF-8, and replaces argc and argv. It is a guard-like class that reverts its changes when going out of scope.

So basic usage is:

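#include <ghc/filesystem.hpp>

#include <cstdlib>   // exit, EXIT_FAILURE
#include <iostream>  // std::cerr

namespace fs = ghc::filesystem;

int main(int argc, char* argv[])
{
    // Guard object: replaces argc/argv with UTF-8 versions until it goes out of scope
    fs::u8arguments u8guard(argc, argv);
    if(!u8guard.valid()) {
        std::cerr << "Bad encoding, needs UTF-8." << std::endl;
        exit(EXIT_FAILURE);
    }

    // now use argc/argv as usual, they have utf-8 encoding on windows
    // ...

    return 0;
}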

That way argv is UTF-8 encoded as long as the scope from main is valid.

Note: On macOS, while debugging under Xcode the code currently will return false as Xcode starts the application with US-ASCII as encoding, no matter what encoding is actually used and even setting LC_ALL in the product scheme doesn't change anything. I still need to investigate this.

Differences

As this implementation is based on existing code from my private helper classes, it inherited some constraints from it. Starting from v1.5.0 most of the differences between this and the standard C++17/C++20 API were removed.

LWG Defects

This implementation has switchable behavior for the LWG defects #2682, #2935, #2936 and #2937. The currently selected behavior (starting from v1.4.0) follows #2682, #2936 and #2937 but not #2935, as I feel it is a bug to report no error on a create_directory() or create_directories() where a regular file of the same name prohibits the creation of a directory, which forces the user of those functions to double-check via fs::is_directory whether it really worked. The more intuitive approach to directory creation, treating a file with that name as an error, is also advocated by the newer paper WG21 P1164R0. The revision P1164R1 was agreed upon at the Kona 2019 meeting (see merge), and GCC has by now switched to following its proposal (GCC #86910).

Not Implemented on C++ before C++17

// methods in ghc::filesystem::path:
path& operator+=(basic_string_view<value_type> x);
int compare(basic_string_view<value_type> s) const;

These are not implemented under C++11 and C++14, as there is no std::basic_string_view available and I did want to keep this implementation self-contained and not write a full C++17-upgrade for C++11/14. Starting with v1.1.0 these are supported when compiling ghc::filesystem under C++17 or C++20.

Starting with v1.5.2 ghc::filesystem will try to allow the use of std::experimental::basic_string_view where it detects its availability. Additionally, if you have a basic_string_view compatible C++11 implementation, it can be used instead of std::basic_string_view by defining GHC_HAS_CUSTOM_STRING_VIEW and importing the implementation into the ghc::filesystem namespace with:

namespace ghc {
    namespace filesystem {
        using my::basic_string_view;
    }
}

before including the filesystem header.

Differences in API

To not depend on any external third party libraries and still stay portable and compact, this implementation is following the "UTF-8 Everywhere" philosophy in that all std::string instances will be interpreted the same as std::u8string encoding wise and as being in UTF-8. The std::u16string will be seen as UTF-16 and std::u32string will be seen as Unicode codepoints. Depending on the size of std::wstring characters, it will handle std::wstring as being UTF-16 (e.g. Windows) or char32_t Unicode codepoints (currently all other platforms).

Differences of Specific Interfaces

Starting with v1.5.0 ghc::filesystem follows the C++17 standard in using wchar_t and std::wstring on Windows as the types internally used for path representation. It is still possible to get the old behavior by defining GHC_WIN_DISABLE_WSTRING_STORAGE_TYPE, which gives filesystem::path::string_type as std::string and filesystem::path::value_type as char.

If you need to call some Windows API, with v1.5.0 and above, simply use the W-variant of the Windows-API call (e.g. GetFileAttributesW(p.c_str())).

:information_source: Note: When using the old behavior by defining GHC_WIN_DISABLE_WSTRING_STORAGE_TYPE, use the path::wstring() member (e.g. GetFileAttributesW(p.wstring().c_str())). This gives you the Unicode variant independent of the UNICODE macro, makes sharing code between Windows, Linux and macOS easier, and works with std::filesystem and ghc::filesystem.

std::string path::u8string() const;
std::string path::generic_u8string() const;
vs.
std::u8string path::u8string() const;
std::u8string path::generic_u8string() const;

The return type of these two methods depends on the C++ standard used and on whether GHC_FILESYSTEM_ENFORCE_CPP17_API is defined. On C++11, C++14 and C++17, or when GHC_FILESYSTEM_ENFORCE_CPP17_API is defined, the return type is std::string, and on C++20 without the define it is std::u8string.
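Code that has to compile against both return types can hide the difference behind a small template. The following is a sketch only: utf8_bytes is a hypothetical helper, not part of the library, and it relies solely on the path type having a u8string() member whose result exposes begin() and end().

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: returns the UTF-8 bytes of a path as std::string,
// whether u8string() yields std::string (C++17 API) or std::u8string (C++20 API).
// It is a template so it can be used with any path type offering u8string().
template <typename Path>
std::string utf8_bytes(const Path& p)
{
    const auto s = p.u8string();
    return std::string(s.begin(), s.end());  // copies char or char8_t code units
}
```

A call such as `utf8_bytes(std::filesystem::path("bar.txt"))` then yields a plain std::string in either language mode.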

Differences in Behavior

I created a wiki entry about quite a lot of behavioral differences between different std::filesystem implementations that could result in a mention here, but this readme only tries to address the design choice differences between ghc::filesystem and those. I try to update the wiki page from time to time.

Any additional observations are welcome!

fs.path (ref)

Since v1.5.0 the complete inner mechanics of this implementation's fs::path were changed to use the native format as the internal representation. Creating any mixed-slash fs::path object under Windows (e.g. with "C:\foo/bar") will lead to a clean path with "C:\foo\bar" via native() and "C:/foo/bar" via the generic_string() API. On all platforms redundant additional separators are removed, even though this is not enforced by the standard and other implementations mostly do not do this.

Additionally, this implementation follows the standard's suggestion to handle POSIX paths of the form "//host/path" and UNC paths on Windows as also having a root-name (e.g. "//host"). The GCC implementation didn't choose to do that while testing on Ubuntu 18.04 and macOS with GCC 8.1.0 or Clang 7.0.0. This difference will show as warnings under std::filesystem. It leads to a change in the algorithm described in the standard for operator/=(path& p), where any path p with p.is_absolute() will degrade to an assignment: this implementation makes an exception when *this == *this.root_name() and p == preferred_separator, in which case a normal append is done, to allow:

fs::path p1 = "//host/foo/bar/file.txt";
fs::path p2;
for (auto p : p1) p2 /= p;
ASSERT(p1 == p2);

For all non-host-leading paths the behavior will match the one described by the standard.
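For paths without a leading host, the same round trip can be written as a small function. The sketch below uses std::filesystem to stay self-contained, and rebuild is a hypothetical helper name. It only demonstrates the standard behavior for ordinary relative and absolute paths, not the "//host" special case described above.

```cpp
#include <filesystem>

namespace fs = std::filesystem;  // assumption: a C++17 standard library is available

// Hypothetical helper: rebuilds a path by appending its elements one by one.
fs::path rebuild(const fs::path& source)
{
    fs::path result;
    for (const auto& element : source) {
        result /= element;  // an absolute element (the root directory) replaces result
    }
    return result;
}
```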

Open Issues

Windows

Symbolic Links on Windows

Symbolic links on Windows have been supported more or less since Windows Vista (with some strict security constraints) and fully since some earlier build of Windows 10 when "Developer Mode" is activated. At the time of writing (2018) they are rarely used, but they are still supported by this implementation.

Permissions

The Windows ACL permission feature translates badly to the POSIX permission bit mask used in the interface of C++17 filesystem. The permissions returned in the file_status are therefore currently synthesized for the user-level and copied to the group- and other-level. There is still some potential for more interaction with the Windows permission system, but currently setting or reading permissions with this implementation will most certainly not lead to the expected behavior.

Release Notes

v1.5.15 (wip)

v1.5.14

v1.5.12

v1.5.10

v1.5.8

v1.5.6

v1.5.4

v1.5.2

v1.5.0

v1.4.0

v1.3.10

v1.3.8

v1.3.6

v1.3.4

v1.3.2

v1.3.0

v1.2.10

v1.2.8

v1.2.6

v1.2.4

v1.2.2

v1.2.0

v1.1.4

v1.1.2

v1.1.0

v1.0.10

v1.0.8

v1.0.6

v1.0.4

v1.0.2

v1.0.1

v1.0.0

This was the first public release version. It implements the full range of C++17 std::filesystem, as far as possible without other C++17 dependencies.