Open Stewori opened 8 months ago
@sobolevn @malemburg according to your posts in #116349 I hope this proposal addresses your API concerns. Please feel free to ping relevant devs for this discussion.
I would suggest using some other name, not vm, because RustPython doesn't have a vm, it uses the rustc compiler 🤔
tooling_version()?
@slozier and @BCSharp , what is your perspective on this for IronPython (please feel free to ping further IronPython devs)?
@sobolevn sure, the name "vm" is placed for discussion. Let's collect a couple of ideas.
@cfbolz @mattip @arigo pinging you just in case there is some point to make from PyPy perspective (not running on a vm, I know).
@jeff5 @fwierzbicki @jimbaker pinging you to make you aware.
@youknowone and @coolreader18 for RustPython
I'd caution against using a tuple as the return value, because it will be difficult to change in the future in a backwards-compatible way. Instead I would suggest a simple object with a documented set of attributes (e.g. using types.SimpleNamespace).
What about a named tuple? What is the advantage of SimpleNamespace over named tuple?
@sobolevn Regarding RustPython, it occurred to me that there is platform.python_compiler(). On Linux I get "GCC [version]" as output from CPython. Perhaps that is better suited to expose a rust compiler version and this proposal would really target vm/middleware info. Just thinking...
Namedtuples are still tuples and therefore make it difficult to add new values. You can hack around it by adding a new property that isn't part of the tuple, but that's hacky. For a use case like this the API that tuple provides is mostly not very useful.
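The backwards-compatibility concern can be illustrated with a small sketch. The type names VmInfoV1 and VmInfoV2 and all field values are hypothetical stand-ins, not part of any proposal in this thread:

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical "before" and "after" shapes of the return value.
VmInfoV1 = namedtuple("VmInfoV1", ["name", "release"])
VmInfoV2 = namedtuple("VmInfoV2", ["name", "release", "vendor"])


def unpack_pair(info):
    """Caller code written against the two-field version."""
    name, release = info
    return name, release


old = VmInfoV1("ExampleVM", "1.0")
new = VmInfoV2("ExampleVM", "1.0", "Example Corp")

print(unpack_pair(old))  # works with the original two-field tuple

try:
    unpack_pair(new)
except ValueError as exc:
    # Adding a field breaks every caller that unpacks the tuple.
    print("unpacking failed:", exc)

# Attribute access keeps working when a field is added later.
ns_old = SimpleNamespace(name="ExampleVM", release="1.0")
ns_new = SimpleNamespace(name="ExampleVM", release="1.0", vendor="Example Corp")
print(ns_old.name, ns_new.name)
```

Callers that only use attribute access are unaffected by the extra field in either case; the breakage is specific to positional unpacking and indexing.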
Related discussion about deprecating 'Java' as platform.system(): https://discuss.python.org/t/lets-deprecate-platform-system-java/48026
Looking at https://docs.python.org/3/library/types.html#types.SimpleNamespace it appears that elements would be mutable, which should be avoided. The property should have strict read-only character, which may be a reason for using a tuple in the first place. I know, there are techniques to make class attributes read-only but that would complicate the minimalistic intention. Probably the best way is to specify that the function creates a new object every time. (It could actively set the values of a returned singleton to its supposed values. But then, if a user stores the returned object and modifies its attributes they might magically change back later, which can be a nasty side effect.) Or is there an elegant way to have SimpleNamespace with immutable attributes?
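One possible answer to the read-only question, offered only as a sketch: a frozen dataclass gives attribute access, rejects assignment, and needs very little code. The class name FrameworkInfo and its fields are hypothetical and were not proposed in this thread:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrameworkInfo:
    """Hypothetical read-only container; not an API from this thread."""
    name: str = ""
    release: str = ""
    vendor: str = ""


info = FrameworkInfo(name="ExampleVM", release="1.0")

try:
    info.name = "Other"
except FrozenInstanceError:
    print("attributes are read-only")

# The generated __repr__ (also used by str()) shows all fields.
print(info)
```

New fields with defaults can be appended later without breaking callers that use attribute access, which is the property a plain tuple lacks.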
Another advantage of tuple is that print(platform.vm_info) would directly produce something useful. SimpleNamespace appears to provide only a __repr__ implementation, so a proper __str__ would need to be added as well (AFAIK the default __str__ would not provide attribute values). That's certainly doable, but again makes it a little more complicated.
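A quick check suggests the __str__ concern may be smaller than feared: when a class defines only __repr__, the default __str__ inherited from object falls back to it, so printing already shows the attribute values. The attribute values below are made up for illustration:

```python
from types import SimpleNamespace

ns = SimpleNamespace(name="ExampleVM", release="1.0")

# object.__str__ delegates to __repr__ when a class defines no __str__,
# so print() displays the attributes without extra work.
print(ns)
print(str(ns) == repr(ns))
```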
Can platform refer to another implementation-dependent module, either C-written or Python-written, named _platform (or _python, _variant, or any other good _something name), and put all CPython-specific literals in it? Then every other implementation only needs to write its own _platform. Here are the places where RustPython hard-codes its name:
- platform._sys_version() https://github.com/python/cpython/blob/v3.12.0/Lib/platform.py#L1113
- site._getuserbase() https://github.com/python/cpython/blob/v3.12.0/Lib/site.py#L278
- sysconfig._getuserbase() https://github.com/python/cpython/blob/v3.12.0/Lib/sysconfig.py#L124
- sysconfig._INSTALL_SCHEMES https://github.com/python/cpython/blob/v3.12.0/Lib/sysconfig.py#L156-L157
- test_cmd_line https://github.com/python/cpython/blob/v3.12.0/Lib/test/test_cmd_line.py#L370
Making an interface for them will be easier once we gather those variants into a single spot.
That would demand a bigger change in CPython than was intended with this issue (it would probably be hard to convince them of such a change). That said, providing a custom version of platform is also what Jython is doing. You may be right that the path to a proper implementation-independent interface would be to collect and consider all variants, but that is well beyond the ambition of this feature request.
Btw, I added several skips on our side for test_cmd_line in https://github.com/python/cpython/pull/116859
From the IronPython perspective, the .NET runtime doesn't really provide a great way to get the information required to populate this so I'm not sure if we'd be able to make use of it. Also trying hard to keep changes to the standard library to a minimum, so pulling the info from an implementation-dependent module would be preferable to modifying platform.py.
> pulling the info from an implementation-dependent module would be preferable to modifying platform.py.
This is just about interfacing. Adding a hook that alternative Python implementations can implement if it makes sense. Only a minimal, almost empty (that is, returning a dummy value) function header would be added, so users can access the info within official Python API. The main purpose would be to define the API and probably to host the API doc and spec. I never worked with IronPython, but I suppose it should not be too difficult to insert a workable custom implementation for that API then, or to monkeypatch the platform module accordingly.
> the .NET runtime doesn't really provide a great way to get the information required to populate this
It surprises me that the .net runtime would not expose its version number and some info. Is it not possible for a .net application to know (without dirty tricks, perhaps) whether it is running on Mono or Microsoft .net? What about the RuntimeInformation.FrameworkDescription Property? On Stack Overflow they say it would also report e.g. "Mono [Version]". Is access to that API not feasible in IronPython?
> Only a minimal, almost empty (that is, returning a dummy value) function header would be added, so users can access the info within official Python API.
Right, but implementation does matter (I'm not proposing either of these, they're just serving as examples). If this simply adds

def vm_info(vm_name='', vm_release='', vm_vendor=''):
    return vm_name, vm_release, vm_vendor

to platform.py then we have to modify platform.py and ship our own (yes I know we already do, but if we didn't have to we wouldn't). Whereas if it were implemented as:

def vm_info(vm_name='', vm_release='', vm_vendor=''):
    try:
        from _platform import vm_info
        return vm_info(vm_name=vm_name, vm_release=vm_release, vm_vendor=vm_vendor)
    except ImportError:
        return vm_name, vm_release, vm_vendor

then we could implement it without touching the Python part of standard library.
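To make the delegation idea concrete, here is a self-contained sketch that simulates both situations in one process. The _platform module is hypothetical (as in the comment above), and the returned values are made up. As a small variation, the try block is narrowed to the import so that an ImportError raised inside the implementation's own function is not silently swallowed:

```python
import sys
import types


def vm_info(vm_name="", vm_release="", vm_vendor=""):
    """Delegate to a hypothetical `_platform` module if it is importable."""
    try:
        from _platform import vm_info as _impl
    except ImportError:
        return vm_name, vm_release, vm_vendor
    return _impl(vm_name=vm_name, vm_release=vm_release, vm_vendor=vm_vendor)


# Simulate an interpreter without the module: a None entry in sys.modules
# makes the import raise ImportError.
sys.modules["_platform"] = None
fallback = vm_info()

# Simulate an alternative implementation shipping its own `_platform`.
fake = types.ModuleType("_platform")
fake.vm_info = lambda vm_name="", vm_release="", vm_vendor="": (
    "ExampleVM", "1.0", "Example Corp")
sys.modules["_platform"] = fake
delegated = vm_info()

del sys.modules["_platform"]  # clean up the simulation
print(fallback)
print(delegated)
```

With this shape, an alternative implementation only has to provide the extension module; the Python source of platform stays identical across implementations.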
Side note, I'm not familiar with how this is used with Java, but what purpose do the function arguments serve? Why would you pass in your own values?
> What about the RuntimeInformation.FrameworkDescription Property? Is access to that API not feasible in IronPython?
I am aware of that API and we expose it via clr.FrameworkDescription. However, it is meant to be a diagnostic string and does not provide any guarantees as to its form so I'm not particularly interested in trying to parse it to split the runtime and version information.
Anyway, don't let IronPython hold you back, we're far enough behind that by the time we get to whatever version implements this .NET might have proper APIs. 😄
What do you think of this API:

def implementation_info():
    try:
        import _implementation_platform
    except ImportError:
        return None
    else:
        return _implementation_platform.implementation_info()

Design:
- _implementation_platform will be optional, you can have whatever you want in other implementations, while CPython won't have it at all so as not to create conflicts with other projects. We will just return None (because we don't have anything that we don't already return in other places)
- _implementation_platform and not just _platform because we might want to create a speedup module called _platform like we do with many other modules: _asyncio, _sys, etc
- name, release, vendor read only properties
- implementation_info sounds better to me, because as I said vm is too specific, some projects might not rely on a custom virtual machine
@Stewori : Python maintains support for the possibility of multiple implementations through sys.implementation. IMO, once one knows the implementation, one may find details specific to the implementation by an implementation-specific path. E.g. knowing it is Jython one can use System.getProperty. The set of properties is large and subtle.
>>> System.getProperty('java.version')
u'1.8.0_321'
>>> System.getProperty('java.specification.version')
u'1.8'
>>> System.getProperty('java.vm.specification.version')
u'1.8'
>>> System.getProperty('java.vm.version')
u'25.321-b07'
I think I am most likely to want the java.specification.version, so that I can know what libraries to expect, and whether Jigsaw is in play, but what use case have you in mind?
Perhaps GraalVM cannot do exactly this, but that implementation could have its own access to the properties that applications need to know.
Edit: I think I am basically suggesting that there may not be a sufficiently uniform idea of the VM information, for the standard library to offer a uniform API to it.
@jeff5 By that logic, it would be enough to expose os.name and once one knows the os one can use os-specific modules and measures to identify the info the platform module provides. No need for a platform module in the first place. I think the spirit of the platform module is to standardize platform info across different platforms and implementations and some moderate info about a possibly underlying middleware would be a justified part of that.
In other words, the platform module was made for that kind of info, so for consistency everyone should actually put it there. This proposal is just one humble step towards a better standardization across Python implementations.
> there may not be a sufficiently uniform idea of the VM information
I thought that a name and version would not be too much to ask and that it would be the minimal kind of information every framework would provide. (I added vendor mainly because it was in java_ver, it's probably not so important). Since this is apparently not even feasible on .net, what about a single info string?
The semantics of the info would be recommended as "[name] [version]" but that would not be a strict rule.
IronPython would provide whatever the content of the FrameworkDescription property is, for Jython we would concat system properties we think are suitable and other implementations may follow this pattern to their liking. The doc in the dummy/interface implementation in CPython may feature an explicitly incomplete list of known example values.
Returning just a simple string would also eliminate the discussion and complexity of what container to return (tuple vs object with read-only properties).
Overall I intended this feature to be simple, so it would not introduce a maintenance burden. (I know, I initially argued against a string value, but that was before the discussion and feedback.)
@slozier If IronPython already exposes the info as clr.FrameworkDescription, like you say, would it be so hard to support this feature by exposing it in the platform module? Given that you already ship a custom module (like Jython does). Could also be done as a monkey patch during startup, whatever works best.
@sobolevn, @youknowone
What info do you think would RustPython expose here that would not fit better into some other already existing platform property? E.g. into platform.python_compiler()? The semantics of that property is to name the compiler that was used to build the currently running Python interpreter, e.g. on linux one gets "GCC [version]". The value "Rust [version]" would fit into that semantics for RustPython, so it appears to me. If there is a good reason not to place it there I am open to renaming this property, e.g. to framework_info (I somehow find tooling_info not a sufficient fit to refer to middleware).
Placing this draft for discussion:

def framework_info():
    '''Returns a string describing a virtual machine, middleware or similar
    kind of framework the current Python implementation is running on.

    Since CPython is not running on any such framework, the reference
    implementation just returns `None`. Alternative implementations may
    expose a framework description via this method in a standardized API.

    The recommended format of the string is "[name] [version]", wherein the
    version part should not contain spaces. Other formats are not forbidden,
    just discouraged. A good reason to diverge from the standard format is if
    the middleware provides the required information in a cumbersome way
    and overly complicated parsing would be required to adjust the format.
    E.g. a browser-based Python implementation might most favorably provide
    the original user agent string.

    :returns:
        Description, likely name and version, of a virtual machine, middleware
        or similar kind of framework the current Python implementation is
        running on, `None` for CPython.
    :rtype: Optional[str]

    Anticipated implementations:

    :IronPython:
        content of the property
        `RuntimeInformation.FrameworkDescription`
    :Jython, GraalPython:
        `System.getProperty('java.vm.name') + " " +
        System.getProperty('java.version')`
    :Brython: user agent string `navigator.userAgent`

    Anticipated example outputs:

    :Java 21 on OpenJDK: `OpenJDK 64-Bit Server VM 21`
    :Java 8 on OpenJDK: `OpenJDK 64-Bit Server VM 1.8.0_292`
    :Java 8 on J9: `IBM J9 VM 1.8`
    :.NET 3: `.NET Core 3.1.32`
    :.NET 7: `.NET 7.0.12`
    '''
    try:
        import _platform
    except ImportError:
        return None
    else:
        return _platform.framework_info()

> By that logic, it would be enough to expose os.name and once one knows the os one can use os-specific modules and measures to identify the info the platform module provides. No need for a platform module in the first place.
I don't think this analogy holds because Python does not expose to us analogous things to those we are discussing, e.g. version when os.name indicates Windows, since uname is not guaranteed provided, according to the docs.
My instinct is for a tuple (pair), but show it used in a plausible application and it will be clearer why these items are the correct choice and the form.
What I'm saying is that the platform module provides easy and standardized access to platform and machine information, even things like platform.processor().
> Python does not expose to us analogous things to those we are discussing, e.g. version when os.name indicates Windows, since uname is not guaranteed provided, according to the doc
I'm sure there are existing system library calls (e.g. via ctypes) via which one could get that information in platform-specific ways (the platform module itself somehow gets the info). Of course that would be much more complicated and require system knowledge. So why should the users go through that hassle to get info about an underlying vm? Is that info less relevant? Then, what relevance does the processor string have? I suppose the use case is diagnostics but who knows? Apparently there was a use case to introduce java_ver and framework_info would just provide similar info (less in fact).
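As a rough illustration of the diagnostics use case, a bug-report header could be assembled as below. This is only a sketch: runtime_report is a made-up helper name, and framework_info is the function proposed in this thread, which does not exist in today's platform module, hence the getattr fallback:

```python
import platform


def runtime_report():
    """Collect a few platform strings for a bug report or log header."""
    # `framework_info` is only proposed in this thread; fall back to a
    # function returning an empty string where it is not available.
    framework_info = getattr(platform, "framework_info", lambda: "")
    return {
        "implementation": platform.python_implementation(),
        "python": platform.python_version(),
        "compiler": platform.python_compiler(),
        "processor": platform.processor(),
        "framework": framework_info() or "",
    }


for key, value in runtime_report().items():
    print(f"{key:>14}: {value or '(not available)'}")
```

The framework entry sits next to the compiler and processor strings without the caller needing any implementation-specific code path.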
> uname is not guaranteed provided
I think the same applies for various properties of the platform module.
Do you mean by "tuple" (name, version)? That already seems to require parsing (and maybe even guesswork) for middleware that does not expose as many system properties as Java does.
Reading the platform doc, I notice that an empty string seems to be the preferred result if info is not available. So the above draft should probably return the empty string instead of None.
> What info do you think would RustPython expose here that would not fit better into some other already existing platform property? E.g. into platform.python_compiler()?
This seems a better fit for the compiler case. Looking at the thread, implementations running on a VM need more information than ones compiled to native code.
> Apparently there was a use case to introduce java_ver
I found a few uses of java_ver (on one of those scraping sites), but those were mostly (all?) scrapes of files now missing.
> Do you mean by "tuple" (name, version)? That already seems to require parsing (and maybe even guesswork) for middleware that does not expose as plenty system properties as Java does.
That's right. Making it requires a split operation for IronPython. Making the string requires a join from a Java implementation. If the user always has to pick the string apart then they must understand to split on the rightmost space. My conjecture that a tuple would be more convenient stands or falls by how the result is to be used.
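Splitting on the rightmost space could look like the sketch below. The helper name split_framework_info is hypothetical, and it assumes the recommended "[name] [version]" format with a space-free version; free-form strings such as a browser user agent would not split meaningfully:

```python
def split_framework_info(info):
    """Split "[name] [version]" on the rightmost space.

    Assumes the recommended format; returns (info, "") if there is no space.
    """
    name, sep, version = info.rpartition(" ")
    if not sep:
        return info, ""
    return name, version


# Sample strings taken from the example outputs in the draft docstring,
# plus one made-up value without a version part.
for sample in ("OpenJDK 64-Bit Server VM 21", ".NET Core 3.1.32", "ExampleVM"):
    print(split_framework_info(sample))
```

Going the other way, a Java-based implementation would simply join the two parts with a single space.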
A tuple has been criticized earlier in this thread; a simple object with read-only attributes seems to be preferred for better backwards compatibility if more fields should be introduced. However, that seems overly complicated to me - the intention was simplistic.
> they must understand to split on the rightmost space
I thought this field might also be populated by Python running in the browser (Brython, e.g.). I looked up the browser/JavaScript/HTML5 API and it seems that the relevant info is only exposed as the user agent string. That string is a different kind of beast: it may also contain rendering engine/version, perhaps also system/version. There are some attempts to parse it but they are complicated and need adjustment every few years. So I thought it might be best to expose an info string unaltered if a middleware exposes only a single string. That seems to be a typical case (.net, browsers). Then users can make of it what they want. For cases like Java where there are plenty of info-properties, the convention "name version" is proposed. That combination seems to me most flexible for unknown further cases of middleware, so the API would not require adjustment in the future. Also, the idea of providing a simple string fits well with most functions in the platform module (IIRC), perhaps with the exception of uname.
Feature or enhancement
Proposal:
There exist alternative Python implementations that run on a virtual machine (vm) or a comparable middleware. The platform module currently lacks an implementation-independent API to retrieve (version-)information of an underlying vm. Examples are IronPython, Jython (3 possibly one day), RustPython. The proposal is to add a replacement for the recently deprecated function platform.java_ver under a generic name platform.vm_info that can optionally be implemented by a Python implementation. The return value of such a function would be a tuple inspired by what used to be returned by platform.java_ver.
The doc of that function states:
IMO only the vm_info part should be returned by the proposed function platform.vm_info, hence the name. OS info should be obtainable from the os module, release should be obtainable similar to CPython's release. For vendor I honestly do not understand what the difference to vm_vendor is supposed to be. As a consequence, I suggest the following definition:
Apparently, this is plainly the old java_ver refactored to the relevant subset. This definition is merely intended as an entry point for discussion. E.g. I would be fine with a different naming etc. if as a result more use cases can be covered. E.g. I am not sure whether for RustPython the notion of a vm would be accurate, so a broader name may be suggested. Also the parameters vm_name, vm_release and vm_vendor are placed here for discussion. For Java this makes sense because there exist e.g. Java implementations by Oracle and IBM (and many more in fact), which is relevant to know besides the release version number. I am rather confident about the idea that a tuple should be returned and that a plain version number would be too narrow to be sufficient. Perhaps even more fields should be defined, e.g. the build-type of the vm.
As many maintainers of alternative Python implementations as possible should be notified to take a look at this proposal to make sure it covers as many use cases as possible.
Note:
Given that with java_ver a special case of this proposal has already been part of the Python standard library for well over a decade, the evidence suggests that a PEP would be overkill for this proposal. Even if not, the discussion in this issue would be a necessary prerequisite for a PEP.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
https://github.com/python/cpython/issues/116349