fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License
240 stars 41 forks source link

v0.2.2 breaks SMBFilesystem #217

Closed tharwan closed 3 months ago

tharwan commented 4 months ago

hi, we are using upath to access an smb share. Unfortunately v0.2.2 broke the functionality.

It appears that the url components are somehow not passed on correctly.

E.g. UPath("smb://server.com/share/folder/file.csv")

will result in

ValueError: Failed to connect to 'server.comshare:445': [Errno 8] nodename nor servname provided, or not known

Edit: I would be happy and try to contribute a basic SMB implementation, is there any guide or example I can follow?

tharwan commented 4 months ago

I appear I can fix this behaviour by implementing a very simple subclass that returns an absolute path:

class SMBPath(UPath):

    @property
    def path(self):
        return "/" + super().path

The docs from ffspec mention that:

Note: if using this with the open or open_files, with full URLs, there is no way to tell if a path is relative, so all paths are assumed to be absolute.

Which sounds like interpreting every path as absolute should be the default behaviour. Should I maybe file a bug with fsspec instead?

ap-- commented 4 months ago

The fact that it worked with a previous version of UPath suggests that it's a UPath bug which got introduced during the refactor. Ideally we would add a SMBPath implementation to UPath to be able to give more guarantees for all the path methods.

I'm on holiday until the end of May and will then try to find some time to work on setting this up. We would need to implement an SMBPath subclass and the test fixtures to run the base test suite against it.

ap-- commented 4 months ago

Edit: I would be happy and try to contribute a basic SMB implementation, is there any guide or example I can follow?

Sorry, just saw your comment. Thank you for offering to contribute! You would have to create tests in upath/tests/implementations/test_smb.py which requires you to write a fixture that sets up the smb filesystem via a docker container. There is code in the fsspec/filesystem_spec repository that does this. You would then put your minimal implementation in upath/implementations/smb.py and add an entry pointing at your new class into upath/registry.py

This should allow you to run the test suite against your class and make adjustments for various edge cases.

Cheers, Andreas 😊

mraspaud commented 3 months ago

Hi, we have the same problem with an sftp-based filesystem:

path = UPath("ssh://myserver.mydomain.com/home/user1/Documents/hello.txt")
print(path.path)

'home/user1/Documents/hello.txt'

ie the first / is missing.