fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License
251 stars 44 forks source link

Implement `UPath.resolve` #86

Closed joouha closed 1 year ago

joouha commented 1 year ago

Hello,

This PR attempts to fix issues around parsing URI paths (fixes #85), so now URI paths with double and trailing slashes do not get mangled:

>>> UPath("http://www.example.com//page//page/")
... # Previously returned HTTPPath('http://www.example.com/////page/page')
HTTPPath('http://www.example.com//page//page/')

It also implements UPath.resolve, which normalizes "." and ".." parts in URI paths.

HTTPPath.resolve will additionally follow redirects. This is done by sending HEAD requests (if accepted by the server, falling back to GET requests), returning a new HTTPPath of the final location.

joouha commented 1 year ago

Thanks!

The new test_resolve method in test_http.py does a basic test of the HTTP redirects.

The test http server responds to requests for http://127.0.0.1:8080/folder with a 301 redirect pointing at http://127.0.0.1:8080/folder/ (with a trailing back-slash).

normanrz commented 1 year ago

Thanks again for the PR! I just released and published the new version v0.0.23.