justindujardin / pathy

simple, flexible, offline capable, cloud storage with a Python path-like interface
Apache License 2.0
170 stars 23 forks source link

Python 3.12 compatibility #106

Closed adrianeboyd closed 8 months ago

adrianeboyd commented 1 year ago

While doing some initial testing for python 3.12 (specifically with python 3.12.0b4), I noticed that pathy isn't compatible out of the box:

$ python -c "import pathy"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/venv312-2/lib/python3.12/site-packages/pathy/__init__.py", line 11, in <module>
    from pathlib import _PosixFlavour  # type:ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: cannot import name '_PosixFlavour' from 'pathlib' (/home/adriane/local/lib/python3.12/pathlib.py)

(Ugh. I wish this weren't so frustrating and time-consuming every year.)

justindujardin commented 1 year ago

@adrianeboyd Yes, I'm frustrated by the constant breakage as well.

According to #105 there's a public API for doing what Pathy does potentially coming in the future. Still, until then, the team can break things without regard to Pathy since the flavour classes are private.

I love spacy and explosion, and it's been great having you rely on Pathy, but I don't have free time to fix this right now.

adrianeboyd commented 1 year ago

Thanks for your response! I'll talk to Matt and Ines about how they'd like to proceed.

At least for the current release, cloudpathlib seems to have similar problems, so it's not an easy alternative (yet, although it was easy enough to switch in general when I tested it last year).

justindujardin commented 8 months ago

@adrianeboyd There's been movement on #105 and I'm working on an update that supports python 3.12. Are you still waiting on this?

I found that the spaCy model to_disk test fails with the new base library since Pathy will no longer hang off of the pathlib.Path class as a parent. The srsly package has a force_path function that converts to a path if the input is not already a Path type, rather than converting it to a Path only if it's a string type as in spaCy. This has the effect of converting Pathy paths into Paths (which are invalid because gs://bucket/blob isn't a POSIX/Windows path):

def force_path(location, require_exists=True):
    if not isinstance(location, Path):
        location = Path(location)
    if require_exists and not location.exists():
        raise ValueError(f"Can't read file: {location}")
    return location

Beyond the spaCy-specific tests failing, the cleanest migration requires breaking changes to how Pathy works. Specifically, it will no longer directly support interacting with POSIX/windows file system paths. Instead, you must use Pathy.fluid or a pathlib.Path instance directly. I suspect most of the changes are rarely touched by users, but I don't know if they will compromise any uses in thinc/spaCy, so it would be good to check in.

If you're still using Pathy, I expect a PR to be ready for review in the next few days, at which point we could coordinate testing and adjust for your needs. I could resurrect some of the old behaviors if needed, but for now, I'll opt for any chance to simplify and stabilize the library.

adrianeboyd commented 8 months ago

It's good to hear that there are some upstream improvements! spacy (now in weasel rather than directly spacy) migrated to cloudpathlib a few months ago.

justindujardin commented 8 months ago

Cool, thanks for letting me know. I'll drop the spacy tests and close this issue when the new version ships.

github-actions[bot] commented 8 months ago

:tada: This issue has been resolved in version 0.11.0 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket: