capi-workgroup / decisions

Discussion and voting on specific issues
8 stars 1 forks source link

Py_PACK_VERSION #47

Open encukou opened 5 days ago

encukou commented 5 days ago

Please vote for adding macros/functions for easier version handling, as discussed on Discourse:

These should go in the limited API.

See the discussion thread for other considered ideas:

vstinner commented 5 days ago

Is it really useful to add Py_PACK_FULL_VERSION(x, y, z, level, serial)? What's the use case?

(it's useful to explain this in docs, and sometimes in projects like Cython)

Cython can use its own macro.

zooba commented 5 days ago

Is it really useful to add Py_PACK_FULL_VERSION(x, y, z, level, serial)? What's the use case?

I see it more as fully defining our version field. We shouldn't make other people replicate our logic for that if we're going to provide any level of support - I'd argue that Py_PACK_VERSION is the more questionable one (especially if macro arguments were properly optional, which they aren't...)

serhiy-storchaka commented 5 days ago

I support adding the macros, but I am against adding functions. The main use of such macros is in preprocessor:

#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= Py_PACK_FULL_VERSION(3, 14, 0, 0xA, 1)

And it only works if they are macros.

Wrappers for non-C languages should implement them as native functions. This is not more difficult than create a wrapper.

zooba commented 5 days ago

Right, but C code is never going to get the function, they'll always get the macro. They can be exported for GetProcAddress and (I assume) dlsym purposes without being in the header files.

It's not trivial for non-C languages to write a function that wraps the macros. That involves adding a C compiler or copy-pasting the logic.

serhiy-storchaka commented 5 days ago
def Py_PACK_FULL_VERSION(x, y, z, level, serial):
    return (x << 24) | (y << 16) | (z <<  8) | (level <<  4) | (serial << 0)

I am sure that it is easy to write in any language.

zooba commented 5 days ago

Sure, but it's not easy to infer without being given that information. So either we present it in the docs as "here's how to do this yourself", or we can simply provide a function that does the same job as the macro (which we've agreed elsewhere is a good policy to have for stability).

(There are probably also languages where it's not easy - very strictly typed languages may not let you shift a byte outside of its range without explicit casts/conversions - but this is very much speculative and besides the point.)

serhiy-storchaka commented 5 days ago

It is already represented in details in the documentation for PY_VERSION_HEX. The Py_PACK_FULL_VERSION should refer to it, or maybe the description should be moved from PY_VERSION_HEX to Py_PACK_FULL_VERSION.

If add function, we should discuss them in details. What types of parameters -- int, int32_t, uint32_t, other? What is the returning type? How to handle errors? What exceptions can be raised?

I do not believe that functions are needed. Macros will be helpful. If authors of wrappers for non-C languages ask us to add functions, then we can add functions that satisfy their needs.

vstinner commented 5 days ago

I would prefer to only provide macros. I don't think that it's useful to add functions here.

encukou commented 2 days ago

This goes in the limited API, where, IMO, we should add function equivalents to macros whenever it's possible. There's no pressing reason to omit the functions here.

Wrappers for non-C languages should implement them as native functions. This is not more difficult than create a wrapper.

That's only true if you're writing the wrapper by hand. If you're generating wrappers from headers (which some projects do, and which I'd like to make easier), functions should make it much easier.


As for Py_PACK_FULL_VERSION being easy to write -- yes, this proposal is all about convenience. Writing out bit-shifts isn't hard, but annoying (and the same goes for the tests).

I'd rather add masking (which is no-op in the primary use case of using a macros on literals):

def Py_PACK_FULL_VERSION(x, y, z, level, serial):
    return (
        (x & 0xff) << 24)
        | ((y & 0xff) << 16)
        | ((z & 0xff) <<  8)
        | ((level & 0x0f) << 4)
        | ((serial & 0x0f) << 0)
    )

Thanks for asking about the signatures. My proposal is to match the inputs' domains, except uint8_t for the half-bytes:

Out-of range level and serial would be masked with & 0x0f. The function cannot fail; it is defined for all inputs. (This means that if we'd ever need to change their behaviour, we'd need to add new functions instead -- but we'd need that anyway for limited-API macros.)

zooba commented 2 days ago

Thanks for asking about the signatures. My proposal is to match the inputs' domains, except uint8_t for the half-bytes

Might as well just go int and mask - a good calling convention is going to use registers, and a bad one is going to push machine words onto a stack, so passing bytes is unlikely to save anything at all. Plus if one day the full-year CalVer idea goes through then uint8_t isn't going to be sufficient (but so many copy-pasted macros would be broken that I expect we'd only pretend that Python is using CalVer anyway, rather than actually switching 😉 )

encukou commented 2 days ago

OK, let's go with int input.

vstinner commented 1 day ago

If you don't want uint8_t, I would prefer unsigned int instead of int.

serhiy-storchaka commented 1 day ago

Let's split the voting separately for macros and functions.

There is a global variable Py_Version as a runtime version of PY_VERSION_HEX. It has type unsigned long. So, for consistency, new functions should also return unsigned long. Their names should follow common pattern for functions: Py_PackFullVersion() and Py_PackVersion().

encukou commented 1 day ago

Let's split the voting separately for macros and functions.

If we did that, I couldn't express my preference to add both at the same time (following a general guideline discussed in https://github.com/capi-workgroup/api-evolution/issues/18).

There is a global variable Py_Version as a runtime version of PY_VERSION_HEX. It has type unsigned long. So, for consistency, new functions should also return unsigned long.

New API should use the fixed-width types, per discussion here. I don't see how the difference would matter in practice.

Their names should follow common pattern for functions: Py_PackFullVersion() and Py_PackVersion().

IMO, functions and function-like macros that do the same thing should have the same name.

encukou commented 1 day ago

If you don't want uint8_t, I would prefer unsigned int instead of int.

Why? Both int types have many values that are invalid uint_8; interpretation of the sign bit doesn't matter.

vstinner commented 1 day ago

Using unsigned int allows the compiler to detect that you pass an invalid value. For me, it makes no sense to use a negative value to build a version number.

Example:

int f(unsigned int arg) { return 0; }
int main() { return f(-3); }

Warning when using -Wconversion:

$ gcc int.c -o int -Wconversion

int.c: In function 'main':
int.c:2:23: warning: unsigned conversion from 'int' to 'unsigned int' changes value from '-3' to '4294967293' [-Wsign-conversion]
    2 | int main() { return f(-3); }
zooba commented 1 day ago

There's a level of "someone doesn't know how to use the API" that we can't help any further.

On one hard, we have the argument that everyone should use the macro (which can't really handle error checking like this). On the other, we have the argument that the function version should have excessive amounts of error checking.

This is a simple enough API that I think we can allow garbage-in-garbage-out. No need to overthink it.

encukou commented 1 day ago

Coincidentally: with masking, -1 is a convenient way to spell “maximum”.