lebedov / python-pdfbox

Python interface to Apache PDFBox command-line tools.

hint: alternative version parsing implementation #33

Closed mara004 closed 4 weeks ago

mara004 commented 1 year ago

The current version parser has a few problems. Version sorting mishandles some edge cases: RC releases are ranked above alpha releases, but for pdfbox v3 the order seems to be the other way round. There are also some minor issues, e.g. pkg_resources.parse_version is called twice (see #29), and pkg_resources itself is deprecated.

I've written an alternative implementation that takes the release date into account: https://gist.github.com/mara004/881d0c5a99b8444fd5d1d21a333b70f8 It doesn't use an HTML parser but plain regex matching, which seemed easier here.
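
To illustrate the general idea (this is not the code from the gist), here is a minimal sketch of picking the newest release by date with a regex. The listing markup, the dates in `SAMPLE_LISTING`, and the names `ENTRY_RE`, `parse_listing` and `latest_version` are all assumptions made up for this example; the real download index may look different.

```python
import re
from datetime import datetime

# Hypothetical directory-index snippet. Markup and dates are illustrative
# only and are not copied from the real download server.
SAMPLE_LISTING = """
<a href="3.0.0-RC1/">3.0.0-RC1/</a>        2021-04-02 08:00    -
<a href="3.0.0-alpha2/">3.0.0-alpha2/</a>  2021-12-20 11:45    -
<a href="3.0.0-alpha3/">3.0.0-alpha3/</a>  2022-05-10 09:30    -
"""

# One entry: a link whose target starts with X.Y.Z (plus an optional suffix),
# followed by a "YYYY-MM-DD HH:MM" timestamp.
ENTRY_RE = re.compile(
    r'<a href="(?P<version>\d+\.\d+\.\d+[^"/]*)/">[^<]*</a>\s+'
    r'(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2})'
)

def parse_listing(html):
    """Return (release_date, version_string) tuples, oldest first."""
    entries = []
    for match in ENTRY_RE.finditer(html):
        released = datetime.strptime(match.group("date"), "%Y-%m-%d %H:%M")
        entries.append((released, match.group("version")))
    return sorted(entries)

def latest_version(html):
    """Pick the most recently released version, ignoring version semantics."""
    entries = parse_listing(html)
    if not entries:
        raise ValueError("no version entries found")
    return entries[-1][1]

if __name__ == "__main__":
    print(latest_version(SAMPLE_LISTING))
```

Because the sort key is the date, it does not matter whether a suffix like RC or alpha would compare higher as a version string. Sorting by date alone would mix separate release lines, though, so a real implementation would probably need to filter by major version first.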