capi-workgroup / decisions

Discussion and voting on specific issues
4 stars 1 forks source link

Add PyConfig_Get() and PyConfig_GetInt() functions #3

Closed vstinner closed 6 months ago

vstinner commented 6 months ago

PR: https://github.com/python/cpython/pull/112609

PyObject* PyConfig_Get(const char *name)

int PyConfig_GetInt(const char *name, int *value)

This PR doesn't add these functions to the limited C API.


Voters:

Since I proposed this API and I'm part of the C API Working Group, I will not vote on this decision.


I consider adding these API to the limited C API later, once the PyInitConfig API will added (but this issue is not about PyInitConfig).

These API cover multiple use cases:

There is no direct replacement of global configuration variable such as Py_VerboseFlag. Calling PyConfig_GetInt("verbose", &verbose) requires the caller to handle exceptions.

PyConfig_GetInt("write_bytecode", &value) (replacing Py_DontWriteBytecodeFlag) gets sys.dont_write_bytecode. For example, the function fails if the sys.dont_write_bytecode attribute has been removed. The PyConfig_GetInt() caller is free to ignore the exception, log the exception, or pass the exception to its own caller.

Discussion: https://discuss.python.org/t/fr-allow-private-runtime-config-to-enable-extending-without-breaking-the-pyconfig-abi/18004 The use case evolved since August 2022, now an API is also needed for the limited C API.

zooba commented 6 months ago

I've already reviewed and had my feedback accounted for, so I'm happy with this as-is.

String constants are about the best of a bad lot here. They're extensible into the future, and safe to use downlevel (on older versions that don't know about them). About the only possibility here would be to define macros with them, but since we don't support compiling for earlier versions from later versions people wouldn't be able to use the header containing newer names anyway, they'd need to be documented literals anyway.

encukou commented 6 months ago

I don't feel comfortable looking at this in isolation while the PyInitConfig_Set* API is still being worked out. Specifically, I don't have a good mental model of the intended relationship between PyConfig, PyInitConfig, and places where PyConfig_Get reads from (sometimes from sys, sometimes the internal PyConfig).

Specific issues I see with this API:

zooba commented 6 months ago

A better name might be PySys_GetConfig (and maybe add Value or Setting on the end), to avoid confusion with the configuration API (even though that's currently where people seem to be looking to find this information... maybe we can reference this from that section in the docs?)

Possibly we could raise NotImplementedError or LookupError for unknown names instead?

Other than our own tests, I'm not sure what the value of an API to return known names would be? The point of an API like PyConfig_Get is EAFP, and so the list of names should be separated from the runtime. And there isn't really a difference in reaction to an unknown name vs. a config value that is nonsense - either it's fatal to the app and it aborts, or it has a fallback value/behaviour that it uses instead. What's the scenario that benefits from being able to eagerly request the list of known names?

vstinner commented 6 months ago

A better name might be PySys_GetConfig (and maybe add Value or Setting on the end), to avoid confusion with the configuration API

Currently, PySys functions are C functions to access the sys module. PyConfig_Get() reads most values from PyInterpreterState.config, only a minority is read from the sys module.

Possibly we could raise NotImplementedError or LookupError for unknown names instead?

Ah sure, the exception type can be modified. For example, KeyError can be raised if the config option name is not known.

zooba commented 6 months ago

Currently, PySys functions are C functions to access the sys module.

Yes, I know. But from Python code, everything that you might read with this API is coming from the sys module, so arguably we should make all the rest available from the sys module as well and then the C API might as well be getting them from the sys module.

It's a little white lie about where the active configuration is stored that helps keep the mental model consistent between Python code and C code. And from an API design perspective, the mental model is far more important than the actual implementation (I mean it's better to lie if it makes things make more sense to users, especially if it makes things more consistent and one day we might change the implementation to be that way anyway).

vstinner commented 6 months ago

@encukou:

I don't feel comfortable looking at this in isolation while the PyInitConfig_Set* API is still being worked out. Specifically, I don't have a good mental model of the intended relationship between PyConfig, PyInitConfig, and places where PyConfig_Get reads from (sometimes from sys, sometimes the internal PyConfig).

PyInitConfig "Set" API is the PR: https://github.com/python/cpython/pull/110176 You can see the currently proposed API there.

I would prefer to work first on the "Get" API since it's simpler, and only later work on the "Set" API.

It's named PyConfig_*, but it doesn't work with the PyConfig struct.

It's named PyConfig_Get() since names are PyConfig member names.

If you dislike it, can you propose another name?

If we later do decide to add get API for the PyConfig struct, the naming will be awkward.

PyConfig_Get() "more or less" reads PyConfig members. The implementation gets sys attribute, since it's what users expect. If Python initialization was configured with value A but then the option value is overriden with B, user expects B (read from sys), not A (read from PyConfig).

If caller early, before the sys module exists, PyConfig members are read. Once Python is fully initialized, sys atributes are read, or PyConfig members are read, depending on the option name.

vstinner commented 6 months ago

Hi! How can I move this discussion forward? Do you need something?

zooba commented 6 months ago

I know this is a dimension we haven't discussed yet, but I wonder if adding an argument to get the config from would help the API make more sense?

I'm thinking an interpreter state or a thread state, which is obviously what it's using already, but if it's explicit then we don't even have to answer it. PyConfig_Get(tstate, ... is obviously getting it from the specified thread, and it doesn't matter whether it's coming from sys or a config struct.

The challenge with that is what to do if we need to lock the GIL, and I suspect the answer is that we fail if the tstate isn't current, and then if we figure it out one day (nogil?) then we can relax the constraint.

zooba commented 6 months ago

And then in that case, maybe PyThreadState_GetConfig naming makes more sense, which means PyConfig_Get remains available for a specific config object/struct, which would satisfy one of Petr's concerns.

encukou commented 6 months ago

Hi, FYI, I will be mostly offline from Friday until the end of the year.

@zooba

Other than our own tests, I'm not sure what the value of an API to return known names would be?

Generating a config dump for an error report, for example.

@vstinner

I would prefer to work first on the "Get" API since it's simpler, and only later work on the "Set" API.

I get that. But I would like to have a good mental model of what the intended solution is. As proposed, I find the the API surprising and incomplete, and I find it hard to the reasoning behind it. As I said before, I think this needs a PEP. I can work my way through the details of the proposal, and I'll have time to do it. But a PEP could make it easier for everyone else to understand, too.

encukou commented 6 months ago

Hm, the reason we're adding a new mapping API, and not using the existing PyObject_GetItem (or PyObject_GetAttr) seems to be that the “thing” that contains these values doesn't exist as a Python object. Is that right?

vstinner commented 6 months ago

I expected such API (two functions) to be simple to agree on, but apparently, it's more completed than expected. @encukou now suggests writing a PEP for adding these two functions. I'm already struggling to get a consensus on my PEP 737 – Unify type name formatting. I don't have the bandwidth to fight for too many APIs in parallel. So I prefer to close this issue for now.

vstinner commented 5 months ago

Follow-up: PEP 741: Python Configuration C API.