Open garcia opened 2 years ago
Went ahead and started on a proof-of-concept NativeOSFS
implementation for my use case: (gist). Most of the code was just Frankensteined over from OSFS
with path conversion stripped out.
I've only spent like 20 minutes on this and didn't write any tests, but my library's unit tests are back to passing with this class as the default FS, so that's cool. If there's interest in a PR for this, I'd be happy to spend some more time fleshing out write functionality & tests, but I also acknowledge that this sort of goes against pyfilesystem2's design philosophy, so, your call :)
just to expand an existing,
os
-based API to support non-OS filesystems.
That's simply not how PyFilesystem is designed to work - you can't mix os
paths and functions with fs
paths and functions; they're fundamentally two different beasts.
E.g. following on from your opendir(r"C:\Users\micro\Documents\example") # ok
example, what should
mem_fs = MemoryFS()
opendir(r"C:\Users\micro\Documents\example", fs=mem_fs)
be expected to do? And as you mention: "will probably lead to confusion when the paths that come back from my API have different separators than the ones that were passed in" !
I think you either need to stick to using only os
paths and functions, or fully-embrace fs
paths and functions - and yes this would mean breaking backwards-compatibility. All IMHO, of course :wink:
This request feels kinda like a duplicate of #238 ?
I did see #238 while researching this issue. Both this issue and #238 concern compatibility with built-in Python modules, but different modules, with different underlying motivations.
E.g. following on from your
opendir(r"C:\Users\micro\Documents\example") # ok
example, what shouldmem_fs = MemoryFS() opendir(r"C:\Users\micro\Documents\example", fs=mem_fs)
be expected to do?
I wouldn't expect this to work. You would still have to use PyFilesystem paths for nonnative filesystems, and operations like listdir
on such paths would also return PyFilesystem paths, preserving user expectations.
I think you either need to stick to using only os paths and functions, or fully-embrace fs paths and functions - and yes this would mean breaking backwards-compatibility. All IMHO, of course 😉
The reality of API design is a long story of balancing backwards compatibility with simplicity. I admire the aesthetic of a nice & clean API with simple invariants, but fully embracing PyFS requires me to reconcile with the practicality of Windows users pasting paths into their console to explore my library's functionality. So, I'll probably just continue to develop my NativeOSFS
hack as a 3rd party implementation for now.
Appreciate the feedback! Feel free to close if there's nothing actionable here.
Perhaps a "cleaner" approach would be to always use pyfilesystem methods and paths throughout your library (and thus avoid your NativeOSFS
hack), and simply provide a wrapper which converts things to-and-from a "system path" for users who want to do command-line experimentation (using native paths)? :shrug:
converts things to-and-from a "system path"
This is why I pointed out that getsyspath
doesn't appear to have an inverse. I would be interested in a solution that simply translates system paths to PyFS paths for OSFS
, but I couldn't find a way to do that reliably, and I suspect drive letters would make it especially thorny.
In any case, my library isn't doing anything particularly fancy at the filesystem level - it just needs to find some files in a nested directory structure and return their paths / read their contents. Most of my users will likely stick to the OS filesystem, but I don't want this logic to be coupled to a particular FS for those with more advanced use cases. So, either the simple use cases or the advanced use cases will need to bear the friction of Windows paths being weird. I would rather shift that burden to the advanced use cases, since those users are more likely to be familiar with this kind of impedance mismatch.
I would be interested in a solution that simply translates system paths to PyFS paths for OSFS, but I couldn't find a way to do that reliably, and I suspect drive letters would make it especially thorny.
I suspect that the only way to do that reliably would be to only translate system-paths that are in a sub-directory of the OSFS's root directory, and throw an Exception (or create another OSFS?) otherwise? :shrug: I.e. if you want to support multiple drive letters you'd probably need a dict
of OSFSes?
Here's a heavily-simplified example of the API I want to offer my users:
The idea here is that users of my API could pass a native OS path with no
fs
parameter to inherit Python's standard path functionality, or they could specify e.g.fs=ZipFS("foo.zip")
and a path within the ZIP for more advanced use cases. Currently, this API exists as a stable release without anyfs
abstraction, so it's important to preserve existing path semantics to avoid a major version bump.Unfortunately, this hypothetical API doesn't work for a few reasons. Most notably,
OSFS
requires aroot_path
parameter, which acts as both a sandbox and a dividing line between system paths and PyFilesystem paths (as documented here). If I default tofs.os.OSFS("/")
, the API will work for absolute paths, but not for relative paths. If I instead default tofs.os.OSFS(".")
, I get the opposite affordances, except relative paths outside the current working directory still won't work due to the sandbox. This friction is compounded on Windows due to drive letters and the backslash separator.The obvious workaround here is to pass
path
into theOSFS
constructor and set the path within that filesystem to/
:...except now if my user explicitly passes an OSFS, the
path
's semantics will be different:This also complicates chaining within my API (which is obviously more complex than this one trivial function) and will probably lead to confusion when the paths that come back from my API have different separators than the ones that were passed in.
I understand there were two mutually incompatible options for this library: make paths consistent with the end user's expectations, or make paths consistent with each other across platforms for ease of programming. I can empathize with the decision to go with the latter option, but this seems like a lot of friction just to expand an existing,
os
-based API to support non-OS filesystems.Can I convince OSFS to always work in system paths? Maybe as a subclass, or just a separate FS implementation? My ideal OSFS API would take no
root_path
and simply pass system paths down to Python'sos
layer, but it's hard to tell how much work that would be, or whether it's even possible or advisable (one hitch I've noted is thatfs.getsyspath
doesn't appear to have an inverse function).(For the sake of ending on a positive note: I really appreciate the thorough and well-organized documentation, which made understanding the root cause of this friction much easier!)