fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License

Override path formatting method for data-URIs #169

Closed by joouha 4 months ago

joouha commented 6 months ago

This PR fixes an issue where UPath.stat is broken for data-URIs:

>>> from upath import UPath
>>> UPath('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVQI12PYeuECAASTAlbqXbfWAAAAAElFTkSuQmCC').stat()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/josiah/.local/share/hatch/env/virtual/euporie/afiNbXov/euporie/lib/python3.11/site-packages/upath/core.py", line 339, in stat
    return self._accessor.stat(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/josiah/.local/share/hatch/env/virtual/euporie/afiNbXov/euporie/lib/python3.11/site-packages/upath/core.py", line 66, in stat
    return self._fs.stat(self._format_path(path), **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/josiah/.local/share/hatch/env/virtual/euporie/afiNbXov/euporie/lib/python3.11/site-packages/fsspec/spec.py", line 1514, in stat
    return self.info(path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/josiah/.local/share/hatch/env/virtual/euporie/afiNbXov/euporie/lib/python3.11/site-packages/fsspec/implementations/data.py", line 34, in info
    mime = pref.split(":", 1)[1].split(";", 1)[0]
           ~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

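This occurs because fsspec.implementations.data.DataFileSystem.info expects the full URI, including the "data:" scheme, as its path argument, while UPath currently passes only the path component. With the scheme stripped, the prefix contains no ":", so pref.split(":", 1) returns a single-element list and indexing [1] raises the IndexError.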
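
To illustrate, here is a minimal, stdlib-only sketch of the failure and of the idea behind the fix. It does not use upath or fsspec code: only the `mime = ...` expression is copied from the traceback above. The comma split before it, the `BaseAccessor` and `DataAccessor` classes, and the `mime` helper are hypothetical stand-ins, so treat this as an approximation of the behaviour rather than the actual change in this PR.

```python
def mime_from_data_uri(path: str) -> str:
    """Stand-in for the parsing step that fails in the traceback."""
    # Assumed: the header is separated from the payload at the first comma.
    pref, _payload = path.split(",", 1)
    # Expression taken from the traceback (fsspec/implementations/data.py).
    return pref.split(":", 1)[1].split(";", 1)[0]


class BaseAccessor:
    """Hypothetical accessor that hands only the path component to the filesystem."""

    def _format_path(self, uri: str) -> str:
        # Drop everything up to and including the first ":" (the scheme).
        return uri.split(":", 1)[1]

    def mime(self, uri: str) -> str:
        return mime_from_data_uri(self._format_path(uri))


class DataAccessor(BaseAccessor):
    """Hypothetical override for data-URIs: keep the full URI, scheme included."""

    def _format_path(self, uri: str) -> str:
        return uri


uri = "data:image/png;base64,iVBORw0KGgo="

try:
    BaseAccessor().mime(uri)
except IndexError as exc:
    # The scheme was stripped, so there is no ":" left to split on.
    print("path only:", exc)

print("full URI:", DataAccessor().mime(uri))
```

Overriding the formatting step only for the data protocol keeps the behaviour of other filesystems unchanged, since those are assumed to still want the scheme stripped.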

ap-- commented 4 months ago

Thank you for your contribution ❤️

I made a few changes and ported it to the soon-to-be-released UPath version.