njsmith / posy


Windows arm64 support #2

Open njsmith opened 1 year ago

njsmith commented 1 year ago

Apparently it's a real thing now: https://www.nuget.org/packages/pythonarm64/

Todo:

  1. Build win_arm64 pybis. (Needs access to a Windows arm64 box. I guess the only practical way to do this is to rent one from Azure? Doing that by hand on every Python release sounds super annoying, so that probably means writing some scripts.)
  2. Build win_arm64 trampolines. (Or is x86-64 emulation cheap enough that we can just re-use the x86-64 trampolines?)
  3. Fix our Windows platform detection code to work on arm64. (Or maybe it already does? Need to test in any case.)

zooba commented 1 year ago

Cross-compiling on Windows (to another Windows platform) works fine,[^1] so you won't need an ARM64 box except for testing. We're still waiting for GitHub to add them to GHA before we consider it "available" for OSS projects. (All the CPython releases are built on x64 machines, and we have some internal machines at work that I run tests on. For the most part, Windows x64 and ARM64 have been identical.)

I'm not sure which trampolines you need, but we may be able to help with that. Similarly with checking the platform detection code. Rust certainly knows about it, as does the platform module, though the best way to check if a Python runtime was natively built for ARM64 is sys.winver.endswith('-arm64').
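
A minimal sketch of that check, for illustration. The helper name `is_native_arm64_build` is hypothetical, and because `sys.winver` only exists on Windows, the value is passed in as a parameter (with an empty-string fallback) so the function can also be exercised on other platforms:

```python
import sys


def is_native_arm64_build(winver=None):
    """Return True if the interpreter looks like a native Windows arm64 build.

    Hypothetical helper: `winver` defaults to sys.winver, which is only
    defined on Windows, so other platforms fall back to an empty string.
    """
    if winver is None:
        winver = getattr(sys, "winver", "")
    return winver.endswith("-arm64")
```

Calling `is_native_arm64_build()` with no argument checks the running interpreter; on anything other than a Windows arm64 build it should report `False`.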

[^1]: With the right tools installed. If you're building on GitHub Actions, they're all there.

njsmith commented 1 year ago

My janky code for repacking your builds into pybis requires running some code on the interpreter to extract metadata, so I think I need an arm64 box to run it.

The trampolines are this weird little subproject, where I reimplemented the distlib Python script -> Windows .exe conversion thing: https://github.com/njsmith/posy/tree/main/src/trampolines/windows-trampolines/posy-trampoline

For platform detection, I'm referring to this code, where posy tries to figure out which kinds of binaries can be run on the current computer: https://github.com/njsmith/posy/blob/main/src/platform_tags/windows.rs

So e.g. you might have an x86-64 build of posy running on Windows-arm64 under emulation, and it still wants to figure out that this box can run both x86-64 and arm64 Python, and it should probably prefer arm64 by default.
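
To make the intended preference order concrete, here is a rough sketch in Python. It is not taken from posy's `windows.rs`: the table, the function name `runnable_platform_tags`, and the machine names used as keys are all assumptions for illustration, and it assumes a Windows version where x64 emulation on arm64 is available. The key point is that the lookup is driven by the native machine, not by the architecture of the running process:

```python
# Assumption: a simplified view of which wheel platform tags each native
# machine type can run, most preferred first. Not posy's actual table.
RUNNABLE_TAGS = {
    "ARM64": ["win_arm64", "win_amd64", "win32"],
    "AMD64": ["win_amd64", "win32"],
    "x86": ["win32"],
}


def runnable_platform_tags(native_machine):
    """Return the platform tags runnable on `native_machine`, best first.

    `native_machine` is the architecture of the hardware/OS, which can
    differ from the architecture of the current (possibly emulated) process.
    """
    try:
        return list(RUNNABLE_TAGS[native_machine])
    except KeyError:
        raise ValueError(f"unknown machine: {native_machine!r}") from None
```

With this shape, an emulated x86-64 process that learns the native machine is arm64 would still list `win_arm64` first.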

zooba commented 1 year ago

> My janky code for repacking your builds into pybis requires running some code on the interpreter to extract metadata

Let me know what metadata you need and I'll see if I can offer a better solution. If you like XML parsing, there's a .props file included that's meant to provide certain properties to MSBuild, so you might be able to extract something from that.

Cross-compiling is a big thing for me right now. There was a discussion a while back about how to provide static metadata with a runtime, but unfortunately it didn't go far enough. But I'm willing to sneak stuff in if it'll help.

> The trampolines are this weird little subproject, where I reimplemented the distlib Python script -> Windows .exe conversion thing

Oh fun :) Your description of #![no_std] reminds me of when I was coding in pure assembly (also for fun). The x64 emulation is certainly good enough for you to get away with these as they are, but they look like they'll probably compile fine for ARM64 too. It's only when you start doing assembly or stack munging that you need arch-specific stuff - the API set is equivalent.

BTW, you probably ought to switch to the *W APIs instead of the *A ones (which are all more-or-less deprecated, due to choosing a semi-random character encoding for all their parameters). Having to work in 16-bit characters might be a pain, but at least you won't have lossy conversions to worry about.
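
The lossy-conversion problem can be illustrated without touching the Windows API at all. This sketch uses cp1252 as a stand-in for a legacy "ANSI" code page (the actual code page depends on the system) and UTF-16-LE as the encoding the *W APIs work in; the helper name `round_trips` is hypothetical:

```python
def round_trips(text, encoding):
    """Return True if `text` survives encode/decode through `encoding`."""
    # errors="replace" mimics a best-effort conversion that substitutes
    # a placeholder for characters the code page cannot represent.
    encoded = text.encode(encoding, errors="replace")
    return encoded.decode(encoding) == text


# A script name containing characters outside cp1252.
name = "\u65e5\u672c\u8a9e.py"

print(round_trips(name, "cp1252"))     # legacy code page stand-in
print(round_trips(name, "utf-16-le"))  # 16-bit characters, as in the *W APIs
```

A path that fails the first check would be silently mangled on its way through an *A call under that code page.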

> For platform detection, I'm referring to this code, where posy tries to figure out which kinds of binaries can be run on the current computer

This looks like the most correct code I'm aware of. Personally I've been leaning towards "if someone is running under emulation, keep running under emulation" and so that x86-64 build would keep generating x86-64 binaries, but I think provided the output is clear enough about what it's doing then it's fine either way.

njsmith commented 1 year ago

> Let me know what metadata you need and I'll see if I can offer a better solution

The metadata extraction script is here:

https://github.com/njsmith/pybi-tools/blob/f424d135d9d60aaaea9db5cb78e81f7aec9494ff/pybi.py#L183-L221

It's the same Pybi-* fields in the Pybi metadata spec -- environment markers, wheel install paths, and wheel tags.
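
Roughly, the kind of thing that has to run on the target interpreter looks like the sketch below. This is not the linked script: the function name `collect_metadata` and the dictionary keys are illustrative, not the real Pybi-* field names, and wheel tags are left out because computing them needs more than the standard library.

```python
import os
import platform
import sys
import sysconfig


def collect_metadata():
    """Gather marker values and install paths from the running interpreter.

    Illustrative only: the keys here are not the Pybi-* field names.
    """
    markers = {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_system": platform.system(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "python_full_version": platform.python_version(),
        "implementation_name": sys.implementation.name,
    }
    # Install locations (purelib, platlib, scripts, ...) as absolute paths.
    paths = dict(sysconfig.get_paths())
    return {"markers": markers, "paths": paths}
```

Because every value comes from the live interpreter, a cross-compiled build would need some other source for the same information.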

> BTW, you probably ought to switch to the *W APIs instead of the *A ones (which are all more-or-less deprecated, due to choosing a semi-random character encoding for all their parameters).

Ah, but I am sneaky:

https://github.com/njsmith/posy/blob/main/src/trampolines/windows-trampolines/posy-trampoline/build.rs

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

zooba commented 1 year ago

> It's the same Pybi-* fields in the Pybi metadata spec -- environment markers, wheel install paths, and wheel tags.

Ah, all packaging stuff. Unfortunately the point of that library is to separate the metadata from the release cycle, so anything I embed into a release would actually be out of date.

Wheel tags should be very predictable (depending on where nogil goes...), but they are unfortunately outside of core "control" these days.

The sysconfig paths will be identical for all platforms - they should all be a single subdirectory each under prefix (except Lib\site-packages), which you are choosing yourself.

> Ah, but I am sneaky:

Nice. (I can't wait for Python to drop Windows 8.1 support :) ). Though I don't see any references to the actual setting? Does Rust add this bit by default?

  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
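
One way to check whether a given manifest actually carries that setting is to parse it and look for the element by its namespace. A small sketch using only the standard library; the function name `active_code_page` is hypothetical, and it assumes the manifest text has already been extracted from the .exe by some other means:

```python
import xml.etree.ElementTree as ET

# Namespace of the activeCodePage element, as in the fragment above.
WS2019 = "http://schemas.microsoft.com/SMI/2019/WindowsSettings"


def active_code_page(manifest_xml):
    """Return the activeCodePage value in a manifest, or None if absent."""
    root = ET.fromstring(manifest_xml)
    # Search all descendants, whatever namespace the parent elements use.
    element = root.find(f".//{{{WS2019}}}activeCodePage")
    if element is None:
        return None
    return (element.text or "").strip()
```

Passing the fragment above as a string should yield the code page name, and a manifest without the element yields `None`.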

njsmith commented 1 year ago

> Ah, all packaging stuff. Unfortunately the point of that library is to separate the metadata from the release cycle, so anything I embed into a release would actually be out of date.

I think the specific metadata should be pretty stable (e.g. it includes CPython ABI tags but not OS/architecture tags), which is good because it ends up as static metadata in a release :-).

But it's not worth worrying about too much anyway. Cross-compiling is fancy but it's easy enough to borrow a Windows/arm64 box from Azure for a few minutes.

> Though I don't see any references to the actual setting? Does Rust add this bit by default?

It's the helper crate I'm using -- it tries to use generally sensible modern defaults: https://docs.rs/embed-manifest/1.3.1/embed_manifest/fn.new_manifest.html