fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License

UPath incorrectly creates a parent object after calling UPath.parents method #63

Closed vilozio closed 2 years ago

vilozio commented 2 years ago

The bug appears when you access the parents property of a path that has a URL scheme prefix such as gs://.

Way to reproduce

from upath import UPath

path = UPath('gs://my-bucket/dir1/dir2/')  # Create a path object.
parent = path.parents[0]                   # Get any parent. The object is missing the _url attribute.
# The next line raises an error because the object is incorrect.
str(parent)
Traceback

Traceback (most recent call last):
  File "C:\Users\korotkovk\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    str(a)
  File "C:\Users\korotkovk\anaconda3\lib\pathlib.py", line 723, in __str__
    self._str = self._format_parsed_parts(self._drv, self._root,
  File "C:\Users\korotkovk\anaconda3\lib\site-packages\upath\core.py", line 149, in _format_parsed_parts
    scheme, netloc = self._url.scheme, self._url.netloc
AttributeError: 'NoneType' object has no attribute 'scheme'

Possible solution

The parents property is implemented in the PurePath class. It returns a _PathParents sequence, whose __getitem__ method builds each parent through _from_parsed_parts without passing the url parameter that UPath relies on:

# Code block from pathlib.
def __getitem__(self, idx):
    if idx < 0 or idx >= len(self):
        raise IndexError(idx)
    # <---- The url param is not passed here, which leads to the error.
    return self._pathcls._from_parsed_parts(self._drv, self._root,
                                            self._parts[:-idx - 1])

I suggest overriding the parents property in UPath to fix this error.
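To illustrate the idea without depending on pathlib internals (which differ between Python versions), here is a minimal toy sketch. ToyPath, ToyURLPath and FixedToyURLPath are hypothetical stand-ins that I made up for PurePath and UPath; they are not the real classes, and the actual fix in upath may look different. The sketch only shows the mechanism: the base class rebuilds parents from parts alone, so the URL is lost unless the subclass overrides parents and passes it along.

```python
from urllib.parse import urlsplit


class ToyPath:
    """Hypothetical, heavily simplified stand-in for pathlib.PurePath."""

    def __init__(self, parts):
        self._parts = list(parts)

    @classmethod
    def _from_parsed_parts(cls, parts):
        # Like pathlib, this rebuilds an object from parts only.
        return cls(parts)

    @property
    def parents(self):
        # Mirrors _PathParents.__getitem__: no URL is passed along.
        return [self._from_parsed_parts(self._parts[:-i - 1])
                for i in range(len(self._parts))]


class ToyURLPath(ToyPath):
    """Hypothetical stand-in for UPath: needs a parsed URL to format itself."""

    def __init__(self, parts, url=None):
        super().__init__(parts)
        self._url = url

    @classmethod
    def from_url(cls, text):
        url = urlsplit(text)
        parts = [p for p in url.path.split("/") if p]
        return cls(parts, url=url)

    def __str__(self):
        # Raises AttributeError when _url is None, as in the traceback.
        scheme, netloc = self._url.scheme, self._url.netloc
        return f"{scheme}://{netloc}/" + "/".join(self._parts)


class FixedToyURLPath(ToyURLPath):
    """Same as ToyURLPath, but parents is overridden to keep the URL."""

    @property
    def parents(self):
        return [type(self)(self._parts[:-i - 1], url=self._url)
                for i in range(len(self._parts))]


broken = ToyURLPath.from_url("gs://my-bucket/dir1/dir2/")
try:
    str(broken.parents[0])
except AttributeError as exc:
    print("broken:", exc)

fixed = FixedToyURLPath.from_url("gs://my-bucket/dir1/dir2/")
print("fixed:", fixed.parents[0])
```

In this toy version the override builds the parents eagerly as a list. A real implementation would probably keep a lazy sequence like _PathParents and only change how each item is constructed.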