Kozea / Pyphen

Hy-phen-ation made easy
https://courtbouillon.org/pyphen
Other
198 stars 24 forks source link

Sorting items obtained from importlib.resources.files() might not always be supported #54

Open jhbuhrman opened 1 year ago

jhbuhrman commented 1 year ago

Version: pyphen 0.14.0 Imported from: weasyprint==52.5

Other versions:

Assuming that Pyphen is trying to support pre-built, pre-compiled, and/or shrink-wrapped Python-based applications (think PyInstaller, py2exe , or Nuitka), it tries to locate resource- or data-files using Python's stdlib resources.files functionality. This is of course a Good Thing™ in itself. However, Pyphen assumes that objects returned from its .iterdir() method can be compared using the < operator, since it calls sorted() on the .iterdir() result, but the Traversible ABC does not guarantee that objects returend by .iterdir() can be compared using the < operator.

When importing Pyphen, in a Nuitka-built program, the following error message is shown:

  File "[...]\standalone\build\manage.dist\pyphen\__init__.py", line 37, in <module pyphen>
TypeError: '<' not supported between instances of 'nuitka_resource_reader_files' and 'nuitka_resource_reader_files'

The following patch will solve this problem:

--- a/pyphen/__init__.py
+++ b/pyphen/__init__.py
@@ -30,11 +30,11 @@

 try:
     dictionaries = resources.files('pyphen.dictionaries')
 except (AttributeError, TypeError):
     # AttributeError with Python 3.7 and 3.8, TypeError with Python 3.9
     dictionaries = Path(__file__).parent / 'dictionaries'

-for path in sorted(dictionaries.iterdir()):
+for path in sorted(dictionaries.iterdir(), key=Path):
     if path.suffix == '.dic':
         name = path.name[5:-4]
         LANGUAGES[name] = path

Please note that I also have tried wrapping the resources.files() result into importlib.resources.as_file, but that didn't help. I don't know whether that should be attributed to Python or Nuitka. Probably the last one, since there's also another issue with Nuitka that it is not able to open the resulting path regardless.

As a final note: is this sorting of hyphenation file names needed anyways?

jhbuhrman commented 1 year ago

FYI, there's another issue with using importlib.resources.files(), when combined with Nuitka in "Standalone" mode, see https://github.com/Nuitka/Nuitka/issues/2393.

kayhayen commented 1 year ago

The Nuitka issue is closed, I would assume this can be closed. I am not partaking here though, so if there is any other issue with Nuitka, report it to me in its tracker.

jhbuhrman commented 1 year ago

The Nuitka issue is closed, I would assume this can be closed. I am not partaking here though, so if there is any other issue with Nuitka, report it to me in its tracker.

The Nuitka issue that is closed, does not solve this particular problem, I believe, or I may be misunderstanding. I've created another Nuitka issue that might provide a work-around for this very issue, ref. https://github.com/Nuitka/Nuitka/issues/2400. As stated there already:

[...] I still think that one should not rely on functionality that appears to be working but is not backed by documentation describing that it should.

So as a concluding note, from a perhaps more "puristic" point of view and trying to be as portable as possible between varying Python "distributions", I would vote for providing a more portable solution in Pyphen itself.

liZe commented 1 year ago

As a final note: is this sorting of hyphenation file names needed anyways?

It is required to always have the same dictionaries for short names.

The following patch will solve this problem:

If that code works, would you like to open a PR? Thanks!