openframeworks / openFrameworks

openFrameworks is a community-developed cross platform toolkit for creative coding in C++.
http://openframeworks.cc
Other
9.98k stars 2.55k forks source link

of::filesystem::path with implicit narrow string conversion #8162

Open artificiel opened 3 weeks ago

artificiel commented 3 weeks ago

FOR DISCUSSION

[edited (strikethrough vs bold) to take into account the progress done vs discussions below]

I’m not sure what is the current "future view" of of::filesystem::path (topic is addressed in a few different PR/issues) but after considering the fact that MSVC does not provide an implicit narrow string conversion operator, and considering we are already importing std::filesystem as of::filesystem, the idea of overlaying a custom of::filesystem::path, based exactly on std::filesystem::path interface but adding the «missing» conversion, sounds feasible. [edit: the proposal morphed from the initial of::filesystem clone of std::filesystem, to embracing std::filesystem and provide a single class of::filesystem::path that covers the missing implicit conversion on windows]

The goal is to support MSVC where std::string is implied in variables or arguments, without arguing against the design of std::filesystem or ::path, and letting modernization occur in parallel with backwards-compatibility.

I unfortunately am not in a position to try things in MSVC so it’s not the most productive process, but this PR implements such a «wrapper» class, where string and wstring can be thrown in & out of of::filesystem::path implicitly and explicitly. [edit: also our methods take and produce of::filesystem::path, which can accept a std::filesystem::path, and will implicitely convert to std::filesystem::path (bascially extracting the underlying path_) when dropped into std::filesystem::exists() et al. this leverages all the work already done to support of::filesystem::path]

Note: the idea of composing with a private std::filesystem::path path_ has been made for 2 reasons: (1) std stuff is generally not inheritance-friendly and (2) have the path inheriting path might cause confusion — having a member with a fully-qualified, private std::filesystem::path type makes things unambiguous.

Note2: it's a bit cumbersome as it's mostly boiler-plate (and maybe some stuff is still missing; this is for discussion/experimentation), but allows the flexibility of augmenting ::path with useful features in the OF context [edit: this is now discouraged, the idea is to be as 1:1 as possible with std::filesystem::path], while maintaining compatibility with any generic std::filesystem::path documentation and examples.

If someone with MSVC can try this and see what happens it would be great (some methods involving ofFile had to be tweaked, and for some reasons comparison operator are finicky too, if you get errors please check if it's only missing a bit of API)

Also, in ofConstants.h I simply inserted the mod at the place where the various #ifdef are true in my position; please double-check you end up in the same spot.

[edit]: a possible ofConstants.h implementation:

namespace of::filesystem {
#if defined(win_32) and defined(OF_TOLERANT_NARROW_PATH_CONVERSION)
// custom class here
#else
using std::filesystem::path
#endif
}

test snippet:

    auto path = ofToDataPath("path"); // defaults to an actual of::filesystem::path
    ofLogNotice("path") << path;

    std::string stringpath = ofToDataPath("stringpath"); // <- converesion should work on MSVC
    ofLogNotice("stringpath") << stringpath;

    // std::wstring wstringpath = ofToDataPath("wstringpath"); // <- conversion now works on Clang/macOS // edit: not in scope

    of::filesystem::path p2 {wstringpath};
    ofLogNotice("p2") << p2;

    std::string str;
    std::transform(wstringpath.begin(), wstringpath.end(), std::back_inserter(str), [] (wchar_t c) {
        return (char)c;
    });
    ofLogNotice("wstringpath") <<str
artificiel commented 3 weeks ago

NOTE: Core OF compiles in VS CI; some ofxTests are failing when comparing empty «default directory» paths, but it might be due to transitionary process within ofFileUtils. Before digging into that it would be preferable to assess if this approach is worthwhile. Also seems wcstombs may be unsafe but that’s a detail to be fixed if it is decided to go ahead with this.

artificiel commented 3 weeks ago

Also, i guess this raises another question: do we need to import std::filesystem as of::filesystem? it seems like "granted" but is there a real reason for it? (i guess during the boost/experimental transition it made sense). a quick github search reveals no adoption of of::filesystem outside of OF core.

we could just have an of::filesystem::path class, that interacts nicely with std::filesystem::stuff but not overlayed in the same namespace?

the advantage of not overlaying is that it's just a new class and no funny using stuff, and in code it would look like std::filesystem::exists(path) where path could be an of::filesystem::path thanks to it's conversion, and it does not create an ambiguous documentation area — we just have to say "our path works and masquerades as a std::filesystem::path plus it implicitely converts to narrow in windows for backwards-compatibility".

[edit to add]: on non-windows, of::filesystem::path could be a direct alias of std::filesystem::path, diminishing further our surface of cases to maintain (i made the class above able to convert to wstring on mac, but that's probably superfluous).

@dimitre @oftheo @NickHardeman @roymacdonald @danoli3

ofTheo commented 3 weeks ago

Thank you so much @artificiel - this is really exciting to see.

I do like the idea of of::filesystem::path being its own thing. I think the danger will always be if people try and treat it like a std::filesystem::path but we have not caught all the edge cases.

That said if its clearly its own thing which can return a string, a wstring and a std::filesystem::path then maybe if you need the std::filesystem::path functionality you can always do that explicitly.

Are you mimicking all the std::filesystem::path functionality? I do like that this could get us to a workable solution for Windows and switching completely over to of::filesystem::path.

I'd be curious what @dimitre thinks.

A few errors in the tests:

This is the error on emscripten:

../../../libs/openFrameworks/utils/ofFilesystem.h:183:45: error: call to implicitly-deleted default constructor of 'std::hash<std::filesystem::path>'
        std::size_t hash() const noexcept { return std::hash<std::filesystem::path>()(path_); }
                                                   ^
/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1/__functional/hash.h:813:36: note: default constructor of 'hash<std::filesystem::path>' is implicitly deleted because base class '__enum_hash<path>' has a deleted default constructor
struct _LIBCPP_TEMPLATE_VIS hash : public __enum_hash<_Tp>
                                   ^
/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1/__functional/hash.h:807:5: note: '__enum_hash' has been explicitly marked deleted here
    __enum_hash() = delete;

And it looks like some of the filesystem tests are failing on linux.

  [  error ] ofToDataPath relative failed 
  [  error ] test_eq(ofToDataPath("",false), "data/")
  [  error ] value1: ofToDataPath("",false) is data/
  [  error ] value2: "data/" is data/
  [  error ] src/main.cpp: 211
  [ notice ] 
  [ notice ] tests #4462
  [ notice ] absolute ofToDataPath with / should end in / passed
  [ notice ] absolute ofToDataPath without / should not end in / passed
  [ notice ] absolute ofToDataPath with / should end in / passed
  [ notice ] absolute ofToDataPath without / should not end in / passed
  [ notice ] 
  [ notice ] tests #4598
  [ notice ] ofToDataPath with empty string shouldn't crash passed
  [ notice ] 
  [ notice ] tests #4563
  [  error ] #4563 test1 failed 
  [  error ] test_eq(ofToDataPath("a.txt"), "data/a.txt")
  [  error ] value1: ofToDataPath("a.txt") is data/a.txt
  [  error ] value2: "data/a.txt" is data/a.txt
  [  error ] src/main.cpp: 242
  [  error ] #4563 test2 failed 
  [  error ] test_eq(ofToDataPath("data.txt"), "data/data.txt")
  [  error ] value1: ofToDataPath("data.txt") is data/data.txt
  [  error ] value2: "data/data.txt" is data/data.txt
  [  error ] src/main.cpp: 243
  [  error ] #4563 test3 failed 
  [  error ] test_eq(ofToDataPath(""), "data/")
  [  error ] value1: ofToDataPath("") is data/
  [  error ] value2: "data/" is data/
  [  error ] src/main.cpp: 244
artificiel commented 3 weeks ago

@ofTheo: it's at the point that OF itself and common/example apps compiles. the API is not so complex, and the data somewhat immutable-ish (very few operations change the actual path). so, assuming someone with MSVC takes this and compiles stuff and tests the limits and we decide to go forward with this approach, i'm not seeing so many edge cases we won't catch. and concretely in the OF context, we are targeting the transparent upgrade of std::string, more than revamping current filesystem usage, which has not yet made it's way into user code.

about the failing tests: some are involving the empty string "" and for now i worked around that by changing the comparisons to .empty() instead. but some look like they should not fail ex:

 error ] #4563 test2 failed 
  [  error ] test_eq(ofToDataPath("data.txt"), "data/data.txt")
  [  error ] value1: ofToDataPath("data.txt") is data/data.txt
  [  error ] value2: "data/data.txt" is data/data.txt

value1 and 2 look the same... but probably the types are differing and operator == is not happy (the only area that is somewhat difficult to handle was the comparison operators, i've done some back&forth against const char * maybe it will need a bit of enable_if). definitely ibetter if someone with MSVC attempts it as it's a bit boring to wait out github CI cycles to read through VS error logs!

but again, if the design decision to go ahead with something like this is taken, i'm sure these kinks will be fixable.

ofTheo commented 3 weeks ago

I do notice that paths print out via cout with quotes vs string. So the same path might print out: "data/myPath.txt" and data/myPath.txt depending on whether it's a string or path.

Not sure if that is the reason the test_eq is failing but agree most likely fixable.

@roymacdonald, would you feel like giving this PR a whirl on Windows?

NickHardeman commented 3 weeks ago

@artificiel, thank you. I like the idea of OF having a path class for extra functionality or convenience.

Just wanted to point out that I was able to get the loading of file paths with utf-strings sorted on Windows for image loading by simply dropping the const and reference (&) from function arguments; so (const of::filesystem::path& path) becomes(of::filesystem::path path). Maybe the const qualifier was restricting the conversion?

See https://github.com/openframeworks/openFrameworks/pull/7435#issuecomment-2438749436

artificiel commented 3 weeks ago

ok compiles clean everywhere, this should be tested in actual MSVC with various real-world cases, keeping in mind that some of the path API might be still to be implemented or might need tweaks -- this PR is not to be considered final, but a functional proof-of-concept, in order to make the decision on how to move forward.

@NickHardeman i had to mess around to with const and passing through return values from internal calls; let's see where this leads. i'm now 50/50 on "adding functionality" as per @ofTheo's comment, on macOS/linux of::filesystem::path could be literally std::filesystem::path, reducing surface, and only on windows we slide the actual of::filesystem::path (which could be simpler, no windows #ifdef as it would be #ifdef'd only on windows).

but all this for discussion, i'm now curious to see how it behaves in real apps.

ofTheo commented 3 weeks ago

Amazing @artificiel !

One further thing to consider if we are going from of::filesystem::path being an alias for std::filesystem::path to this OF class which handles string wstring and paths.

Should we just build on ofFilePath which is already there and currently just has a bunch of static functions?

So auto path = ofToDataPath("path"); would be ofFilePath path = ofToDataPath("path");

Just a thought as I think of::filesystem::path makes more conceptual sense as an alias for std::filesystem::path. And also if switched the functional class you made to ofFilePath then of::filesystem::path could remain as a simple alias which might be handy in some edge cases.

image

EDIT: If we did go this route the arguments to ofFilePath static functions and the returns could be ofFilePath so that way being wstring / path and string friendly.

artificiel commented 3 weeks ago

what i like about of::filesystem::path masquerading std::filesystem::path is that all the docs tutorials and examples using std::filesystem::path will be applicable 1:1. we just "augment" the API on windows with an implicit narrow string converter, and allow our methods to take a std::filesystem::path via implict conversion to of::filesystem::path and if you don't funnel your paths in narrow std::string you can use std::filesystem::path cross platform as-is (copy paste code etc.).

this is considering the scope is to transparently "fix" MSVC's path implementation by allowing std::string str = ofToDataPath("a"); on windows without the users having to do anything on their existing code.

ofTheo commented 3 weeks ago

@artificiel - just to confirm does that mean keeping of::filesystem::path as it is in this PR and then implementing all the std::filesystem::path functions?

artificiel commented 3 weeks ago

well the implementation is already handled by the actual std::filesyste::path class as (except the narrow conversion) all methods are forwarded to the private path_. right now there's #ifdef's win32 in there, but ultimately it would be windows-only, also simplifying ofConstants.h:


namespace of::filesystem {
#ifdef win_32 
// custom class her
#else
using std::filesystem::path
#endif
}
ofTheo commented 3 weeks ago

oh I see okay! that makes sense.

so in theory all the placed in this PR where you have of::filesystem::path changed to std::filesystem::path, would be unneeded in the final PR?

artificiel commented 3 weeks ago

I updated the top comment to summarize the end state of the PR

dimitre commented 2 weeks ago

I think it is great if helps the transition from string to fs::path. I would suggest we keep importing filesystem to of::filesystem as it is, and this project be named of::path. when it is not needed it is an alias to of::filesystem::path.

artificiel commented 2 weeks ago

@dimitre your point makes it a good to summarize (I understand the comment thread here is a bit messy!) so there are 2 independent questions:

in essence, I focus my view on the notion of removing as much as possible surface, unless required.

dimitre commented 2 weeks ago

now std::filesystem is mature we can surely stick to it if we choose OF to be c++17 or newer. of::filesystem was just a way of aliasing the possible filesystem in older computers, but yes we have the previous releases compatible with c++11/14

GitBruno commented 1 week ago

Can we add std::string as input path as well.

ofToDataPath(const fs::path & path, bool makeAbsolute)
ofToDataPath(const std::string & path, bool makeAbsolute)

We just want to type like the old days...

ofToDataPath("myFolder/myFile.ext");

Happy to add a PR btw let me know

artificiel commented 1 week ago

@GitBruno the current git/nightly already implements ofToDataPath() as:

std::string ofToDataPath(const of::filesystem::path & path, bool absolute = false);

as both std::filesystem::path and this PR's of::filesystem::path take string, wstring, and char * as constructor args. in other words your request is already working.

the real feature here is to change the return type to of::filesystem::path

of::filesystem::path ofToDataPath(const of::filesystem::path & path, bool absolute = false);

in a manner that will implictely convert to narrow std::string in Windows (which the strict implementation of MSVC does not).

auto aaa = ofToDataPath("yes.txt"); // aaa is of::filesystem::path everywhere
std::string bad = ofToDataPath("no.txt"); // FAILS in windows without this PR
std::string good = ofToDataPath("yes.txt"); // works in windows with this PR

the correct way of working with filesystem path representation in a complete safe crossplatform manner is to adopt ::path and never trying to pass paths as std::string as windows might be unable to convert some wide chars — but obviously existing code won't rewrite itself autonomously, so we have to support it.

it's unfortunate that MS did not envision the crossplatformability of std::filesystem::path with flexibility in mind (i.e. they don't consider code written for windows to be useable elsewhere) but this PR's approach seems like a good compromise.

BTW if you are in a position to test this branch on actual windows it would be great, even more so if you happen to use unicode files and directories! any code that represent filesystem paths as std::string (i.e. most of addons) should interact "as is" with the OF core.

GitBruno commented 1 week ago

Yes the current implementation is a problem for me as of::filesystem::path might take a string but my compiler doesn't know that. So I cannot pass a string into ofToDataPath which expects a of::filesystem::path:

Current Implementation as mentioned:

std::string ofToDataPath(const of::filesystem::path & path, bool absolute = false);

Therefore I was suggesting to add the following overload function:

std::string ofToDataPath(**std::string path**, bool makeAbsolute = false);

of::filesystem::path might have an overloading operator for std::string() but std::string does not have an overloading operator for of::filesystem::path().

On a side note, it might also be good to be able to write like this:

ofToDataPath("my.file")  // Getting a proper path back is great btw :)

However with a const& I'll have to resort to this:

std::string sFilename = "my.file"
ofToDataPath(sFilename)

As far as I know the reference is not used as a reference and get copied anyway? So I do not see any benefits to use a reference here.

artificiel commented 1 week ago

@GitBruno just to be sure, if you are having a problem with what is in the current branch (not specifically this PR) please post a freestanding issue as this PR is only adding an implicit conversion operator for std::string to of::filesystem::path, which is unrelated to the behaviour of the type entering ofToDataPath().

(to clarify things, outside of this PR, of::filesystem::path is an alias to std::filesystem::path, and ofToDataPath() has been taking const std::filesystem::path & arg since at least OF 11.2.)