keygen-sh / py-machineid

Get the unique machine ID of any host (without admin privileges).
MIT License
56 stars 12 forks source link

[Packaging, Docos, & More] Minor quandaries with the upcoming `machineid 0.4.1` release #6

Closed leycec closed 1 year ago

leycec commented 1 year ago

Thanks so much for the renewed focus on this awesome security-oriented package, @ezekg... and for open-sourcing basically the entirety of keygen.sh. I applaud your forward-thinking bravado that will change the course of security in Python as we know it. I actually mean what I just said.

You dah man and I believe you already know this. Onward to the actual issues!

Quasi-Official Gentoo Support

Firstly, I package for a little-known Linux distro no one has ever heard of called Gentoo Linux. As my own show of affection for what you do, I've begun packaging the entirety of the keygen.sh toolchain for Gentoo at my third-party overlay raiagent. This is me saying:

"Hello and thanks for all the security."

Gentoo users can now install machineid (and eventually everything else) as follows:

eselect repository enable raiagent
emerge --sync raiagent
emerge machineid

This is the way. :fist_raised:

Version 0.4? Where We're Going, There Is No Version 0.4

I couldn't help but notice that the most recent stable release of machineid on PyPI is the two years-old machineid 0.3 but that the next stable release of machineid as demarcated in your setup.py is... machineid 0.4.1?

Is that right? Did machineid 0.4 quietly get released somewhere in the meanwhile but nobody noticed? I'm having unsettling flashbacks to that tree that fell in the forest but nobody was there.

winregistry: The Mandatory Dependency That Hates Linux

setup.py currently requires the Windows-specific mandatory dependency winregistry, which doesn't really make sense for non-Windows platforms like Linux and macOS. Ideally, that dependency should be made conditional on Windows via PEP 496-compliant environment markers.

Although untested, this trivial change should suffice to make that magic happen:

  # Please just change this keyword parameter in "setup.py"...
  install_requires=['winregistry'],

  # ...to resemble this instead. My ancient Linux devbox thanks you.
  install_requires=['winregistry; sys_platform == "win32"'],

My Gentoo ebuild for machineid currently forces that with a sed hack. I sigh. :face_exhaling:

Consider Hatch: Unlike Setuptools, It Doesn't Suck

I recently migrated our team from setuptools to Hatch, which the Python Packaging Authority (PyPA) recently adopted as an official first-party Python project. The big win here is security: declarative Hatch-driven pyproject.toml files are substantially more secure than executable setuptools-driven setup.py scripts, for hopefully obvious reasons.

Although untested, here's how I might define a Hatch-driven pyproject.toml file for this project:

[project]
name = "machineid"

description = """\
Get the unique machine ID of any host (without admin privileges).\
"""

license = { file = "LICENSE" }
readme = "README.md"

#FIXME: Uncomment if you'd like to publish keywords for this project, which you do.
# keywords = ["muh_keyword", "muh_other_keyword",]

classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]
authors = [
    { name="Zeke Gabrielse", email="oss@keygen.sh" },
]

# Dynamically harvest the current version for this project from the PEP
# 8-compliant the "__version__" dunder global in your "machineid.__init__"
# submodule, which you (of course) have defined. You have, haven't you? *sigh*
dynamic = ["version"]

requires-python = ">=3.8"
dependencies = [
    'winregistry; sys_platform == "win32"',
]

#FIXME: Uncomment to enable these optional dev extras, if you'd like.
# [project.optional-dependencies]
# all = []
# doc = [
#     "sphinx >=4.2.0",
# ]
# test = [
#     "pytest >=5.4.0",
#     'tox >=3.20.1',
# ]

[project.urls]
"Homepage" = "https://calculion.streamlit.app"
"Documentation" = "https://github.com/keygen-sh/py-machineid/blob/main/README.md"
"Repository" = "https://github.com/keygen-sh/py-machineid"
"Releases" = "https://github.com/keygen-sh/py-machineid/releases"
"Issues" = "https://github.com/keygen-sh/py-machineid/issues"

[build-system]
requires = ["hatchling >=1.14.0"]
build-backend = "hatchling.build"

[tool.hatch.version]

# Relative filename of the Python submodule defining either a PEP 8-compliant
# "__version__" dunder global *OR* a PEP-noncompliant "VERSION" global, which
# Hatch then statically parses to obtain the current version of this project.
path = "machineid/__init__.py"

Fairly confident that suffices. Your mileage may vary, possibly including flaming dumpster fires. :wastebasket: :fire:

PyPI: It's Super-busted, Bro

Relatedly, the PyPI page for machineid is super-busted. Specifically:

The author of this package has not provided a project description :grimacing:

For these reasons (and more), you probably want to add a .github/workflows/release.yml file. Consider copy-and-pasting from @beartype's inspirational .github/workflows/python_release.yml file, which successfully published literally dozens of @beartype releases to GitHub Releases, PyPI, and conda-forge over the past several years for me.

I don't even know how to do those things manually anymore. </gulp>

The Actual Code: I Couldn't Help Myself

I couldn't help but rifle around in the machineid codebase and immediately noticed this suspiciously smelly code smell:

def id(winregistry: bool = True) -> str:
  ...
  if platform == 'darwin':
  ...
  if platform == 'win32' or platform == 'cygwin' or platform == 'msys':
  ...
  if platform.startswith('linux'):
  ...

That's... definitely not quite right. Since the current platform is mutually exclusive of any other platform, every top-level if branch after the first should instead be trivially refactored into an elif: e.g.,

def id(winregistry: bool = True) -> str:
  ...
  if platform == 'darwin':
  ...
  elif platform == 'win32' or platform == 'cygwin' or platform == 'msys':
  ...
  elif platform.startswith('linux'):
  ...

Asperger's: the questionable superpower that just keeps on giving. :vomiting_face:

PEP 561: The Python Standard That Sucks

As the principal maintainer of @beartype, I proudly salute your perspicacious usage of PEP-compliant type hints everywhere. You should know, however, that your type hints are currently being ignored by everybody. Why? Bureaucracy. </facepalm>

You failed to ship an empty and otherwise meaningless PEP 561-compliant machineid/py.typed file. Without that nonsensical file, static type-checkers like mypy and pyright as well as IDEs like VSCode and PyCharm will silently ignore all type hints defined in your package. Yes, this is something awful. On the bright side, at least Python standardized something awful.

To resolve this, just:

touch machineid/py.typed

That's it. Congratulations. You now satisfy PEP 561. This could not be more dumb.

@beartype: The Runtime Type-Checker That Doesn't Suck

Lastly, consider @beartype for all your runtime type-checking needs. Static type-checkers like mypy and pyright don't actually enforce type hints at runtime; @beartype does, enhancing both security and reliability without reducing performance or doing anything else that's bad.

Confession: I write @beartype and am therefore the most biased human who tried to sell you something suspicious today. ...awkward moments on GitHub.

Completely Unrelated: Docos

Long-term, consider painfully rewriting your entire docos for keygen.sh with Sphinx + MyST + the Sphinx Book Theme. Nobody wants to do this but is always glad they did. The existing API docos for keygen.sh do sorta suffice but are kinda awkward and non-standard in navigation, structure, and readability. My eyes bleed when I read them.

Short-term, consider:

Nuitka is mandatory. If security matters enough to use keygen.sh, security matters enough to use Nuitka.

Aaaaaaaand... I'm Spent

That escalated quickly. I accidentally wrote a doctoral thesis on the future of keygen.sh.

This is why my wife yells at me.

ezekg commented 1 year ago

Just to clarify, the external package name is py-machineid, not machineid: https://pypi.org/project/py-machineid/. The machineid name was taken, and we weren't able to get in contact with the owner, so the py- prefix was added.

A Python SDK is on the roadmap, but low priority compared to other languages like Rust and C#.

I'm a Ruby guy, not really a Python guy, so a lot of this goes a bit over my head (build tools).

I'm still digesting the post, but will get back to you soon on the rest.

ezekg commented 1 year ago

Newest version implemented a few of your suggestions, including the py.typed file and Windows-specific dependencies.

Closing since the others aren't particularly needed (e.g. switching build-tooling/type-checker, etc.)

leycec commented 1 year ago

Sure! No worries. Thanks so much for taking the time out of your busy schedule to listen and respond, @ezekg.

That said... switching away from setup.py is definitely needed. setup.py scripts are insecure by definition and likely to be deprecated within the next several years for that very reason. Migration from setup.py to pyproject.toml is not simply desirable; it's necessary. setuptools itself will probably force your organizational hand at some point, is what I'm saying.

The other suggestions are established best practice but also ultimately subjective. If you'd prefer to direct scarce resources elsewhere, that's absolutely understandable. Thanks again for opening all of this awesomeness! :+1:

ezekg commented 1 year ago

Open to PRs, but the rest isn't a high priority right now.