Open maxbachmann opened 1 year ago
Yeah, I don't think we want to bother with making them required attributes, so I'd add:
sys.implementation._architecture
and make it the value of $(ArchName)
from python.props (this approx. matches platform.architecture)sys.implementation._api_partitions
and make it a tuple of all the ones we test for. No need to add names, simple in
checks will be the most common useIs sys.implementation
the right place? It should have things related only to the Python implementation, that other implementations also have. The architecture is a build characteristic and the api partitions seem like it as well. Where do we normally record build or platform (or OS) information?
The only real alternative is at the top level of the sys
module. Maybe you could make a case for it going into sysconfig
, but we still need an attribute on a native module somewhere, as it's the only way we're getting the field it at build time.
I assume this hasn't come up before because Windows is the only OS with transparent CPU emulation? So querying the current OS architecture isn't sufficient to determine the current process architecture.
sys._build.*
?
Really it should be sys.platform_but_for_real_this_time
as we can't just go updating sys.platform
any more.
What if we added sys.architecture
for all platforms? platform.architecture(sys.executable)
is too limited in its return values to work here, but it ought to be known at compile time, right? And something like Jython or IronPython can use clr
or jvm
if they prefer. Maybe they've already got an additional field for that?
the platform as 32-bit vs 64-bit vs ARM64.
There are multiple existing APIs to do that. Examples:
>>> platform.architecture()
('64bit', 'ELF')
>>> os.uname().machine
'x86_64'
I don't think that we should add information retrieved at runtime in sys. Usually, sys is more information computed during the build.
platform.architecture()
doesn't distinguish between ARM64 and x64, and os.uname().machine
doesn't reflect how CPython was compiled. So we need to store the target architecture at build time.
But you're correct that we don't need to retrieve it at runtime. (Though worth noting that a lot of sys
is dynamic - path, modules, various hooks, etc.)
Hmmm. Actually, the discussion in the PR failed because no one decided whether to merge it or not (it seems that the only supporter is @zooba). The newly added arch is used to get the current build architecture in Windows (it comes from python.props). What do you think?
link: https://github.com/python/cpython/pull/124582 https://discuss.python.org/t/regarding-whether-we-should-add-py-currentarch-or-py-archname-function/65191/17
My main concern was/is that this information doesn't belong on sys.implementation
. That's meant for information about the Python implementation (e.g. cpython, pypy, jython, micropython), not about the build. See PEP 421.
Something like sys._architecture
, sys.build.architecture
, or platform.???
or on sysconfig
(like @zooba suggested) would be more appropriate. (FWIW, I like the idea of sys.build.*
, PEP-421-style, but its hypothetical contents might already be covered elsewhere.)
All that said, I haven't looked closely at what we're trying to actually accomplish here, so I don't have much feedback on that.
I'm often the only supporter of things involving Windows :-) Usually aggregated from a lot of discussions with a range of users, but not easily cited (many Windows users are scared off by the culture here, so they won't report stuff themselves).
In this case, we've been looking at adding support to project's build scripts to allow building on Windows ARM64, which requires the builder to know whether it should be targeted ARM64 or not. That's usually done by running in the ARM64 interpreter, and then to cross-compile you'd run in the x64 or x86 interpreter, but we don't currently have a reliable way to detect that. Historically we haven't needed it, because sys.maxsize
(or sys.maxint
) was sufficient to detect 64-bit vs 32-bit. But that can't tell the difference between ARM64 and AMD64.
More importantly, and relevant for this case, is that platform.architecture()
can't return the difference, even if it could detect it, because it's defined as only returning "64" or "32". And PEP 421 explicitly suggests sys.implementation
is the place to store information that would be returned from platform
about the Python implementation. The architecture the current runtime was compiled for sure seems to be related to the implementation, and since platform
would be the obvious place to return such information (as it already is, albeit insufficiently), adding the field to sys.implementation
seems fine.
Having now re-read that PEP though,[^1] I think we should drop the underscore. Make it sys.implementation.architecture
and if it exists then platform.architecture(sys.executable)
can return it rather than querying the actual binary (if it's within the range of allowable return values, and perhaps we can later add platform.architecture_but_for_real_this_time()
)
[^1]: Specifically Non-required Attributes and Use Cases, the platform and Jython use cases being most relevant.
platform.machine()
is another possible place for such information. It already has Windows specific code to detect the architecture.
platform.processor()
also seems like a good candidate. There's also a handful of platform.win32_*()
APIs. Perhaps a similar one could be added. I agree with Eric and others that sys.implementation
seems unfit for this API.
platform.machine()
and platform.processor()
are for different purposes. We want to know the compile-time target of the runtime, which is unrelated to the current machine type (Windows has emulation on most architectures, so the machine could be ARM64 but the runtime is x86 and there's currently no way to determine this).
I agree. If I remember correctly, the win kernel is actually simulating multiple subsystems at the same time (such as win32. It's used to be compatible with historical calls). Maybe when building, it's in win32
, but in fact it's in amd64
. Maybe it‘s in arm64
. These can only be obtained in the initial state at build time.
OK, I think I understand Steve's point of view now. There's precedent for this in the stdlib: we're using sys.implementation
in platform
here: https://github.com/python/cpython/blob/10c4c95395771fb37e93811aaace42f026c16de5/Lib/platform.py#L889
I'm only comfortable with sys.implementation._architecture
if it's an internal thing used only for platform
, instead of a public interface--then, it really is an implementation detail, and sort of fits there.
Alternatively, we could add a public sys.architecture
that also supports other systems, rather than just Windows. It doesn't seem like it would be all that difficult to just define ARCH_NAME
as something else for Linux builds.
I think that for sys.architecture
, a new PR can be submitted after the _architecture
attribute is added. This issue may be put on hold for now. In fact, if possible, the value obtained by sys.architecture
should be the same as the _architecture
attribute.
If we go for sys.architecture
, then there's no need for an _architecture
attribute at all.
As far as the discussion is concerned, it seems that only the _architecture
attribute is what we need, because other platforms are not actually as complicated as windows.
My main hesitation with defining it as public sys.architecture
is that now we have to figure out sensible values for all the other platforms as well... what do we put for WASI? How do we under-specify it enough that we can expand to new platforms later on and not run into platform.architecture
type issues (or sys.platform
type issues)?
Platforms that use our configure
build system (that is, most platforms) can retrieve compile time stuff via sysconfig
APIs.
The sysconfig
module sounds like a good place to add the build-time architecture on Windows.
The sysconfig
module doesn't have a native component, and we don't have a compile-time generated file that ships with the runtime (yet). sysconfig
retrieves compile-time stuff on Linux because we include a copy of Makefile
, and it scans it.
Storing a single value somewhere in a native module is much easier on Windows than adding an entirely new static file that is generated at compile time. The question is about where we should store the value, more than where it gets retrieved from.
The
sysconfig
module doesn't have a native component, and we don't have a compile-time generated file that ships with the runtime (yet).sysconfig
retrieves compile-time stuff on Linux because we include a copy ofMakefile
, and it scans it.
Actually, I was just surprised to learn that it looks like a _sysconfig
module has snuck in via gh-88402/GH-110049[^1].
[^1]: And to be clear, I consider that a good thing :). More of sysconfig
/_sysconfigdata
should probably make its way there eventually.
IMO, this should be implemented by adding MULTIARCH
— which represents a target triple — in sys.get_config_vars()
on windows, while we don't have a better API. I don't think sys.implementation
is the right place, this is a build detail, not an implementation-specific detail.
I was just surprised to learn that it looks like a
_sysconfig
module has snuck in
Ah, so it has.
this should be implemented by adding
MULTIARCH
— which represents a target triple
Given that we can embed it at compile time, I'm fine with this. Do we have official target triples for Windows? Or do we just borrow them from someone else (I'm pretty sure Rust uses them, yeah?)
I actually have no problem with adding the arch member to MULTIARCH
, but would that be better than adding it to the implementation
? For example, in terms of declarative?
@zooba, what do you think about this?
would that be better than adding it to the
implementation
? For example, in terms of declarative?
It would be added to the Modules\_sysconfig.c
file (which I'd forgotten about), so it's easy enough to add. It's a bit rough on users, but at least it'll be possible to get at the information.
We need a spec for MULTIARCH
though, to define exactly what the values should look like.
That is, it should look like this
>>> import _sysconfig
>>> _sysconfig.config_vars()
{'EXT_SUFFIX': '.cp314-win_amd64.pyd', 'SOABI': 'cp314-win_amd64', 'Py_GIL_DISABLED': 0, 'ARCH': 'win32'}
Edit (or maybe I didn't output the value correctly).
It would be useful to be able to detect: 1) the platform as 32-bit vs 64-bit vs ARM64. 2) the windows api partition
@zooba you mentioned adding them to
sys.implementation
. Since they are only available on windows, I assume this would mean adding them with an underscore prefix? The alternative would be a separate API similar tosys.winver
.This is a follow up to #102256.
Linked PRs