python / typeshed

Collection of library stubs for Python, with static types
Other
4.37k stars 1.75k forks source link

Use `Literal[0, 1, 2]` for seek's "whence" parameter #11421

Open srittau opened 8 months ago

srittau commented 8 months ago

This will affect quite a few classes and will possibly have a fairly significant primer impact, but we should at least work towards this goal.

max-muoto commented 3 months ago

@srittau PR experimenting with this https://github.com/python/typeshed/pull/12335. Pandas seems to be of the libraries with the most issues, so perhaps we could help out with updating their external stubs/inline types, we would of course have to ship this first though.

srittau commented 3 months ago

@max-muoto Thanks for doing this! Maybe it's easier to start smaller. E.g. start with a PR that changes only protocols or a PR that only changes stdlib/io.pyi. That said, overall this looks quite promising.

picnixz commented 3 months ago

Strictly speaking, C11 does not ask for SEEK_SET, SEEK_CUR and SEEK_END to be 0, 1, and 2 respectively. They can be any distinct values (though I don't know of any C11 implementation that changes those values) [also, good to perhaps change those values in CPython, I'll work on that]. Ideally, a LiteralInt type would have been ideal to indicate an integer but platform-dependent constant... but I don't think we have this (I'll also ask on discourse if it's relevant for the typing community for typing constants that we don't know about).

By the way, there is currently no plan for supporting SEEK_DATA and SEEK_HOLE although (custom) Buffered IO objects might provide an implementation for this whence parameter (otherwise, it falls back to the default implementation which AFAIK raises if you are not using SEEK_SET, SEEK_CUR, or SEEK_END).

Akuli commented 3 months ago

C's SEEK_SET, SEEK_CUR and SEEK_END values are not relevant. Python implements its own IO using C functions that are one level lower than C stdio (e.g. python uses open() instead of fopen() on posix).

For example, if you run gdb --args python3 -c 'open("foo.txt", "w")' and set a breakpoint on fopen, you will see that fopen() is called for a few other files but not foo.txt. But if you instead set a breakpoint on open(), you will see how it opens foo.txt.

picnixz commented 3 months ago

Actually, yes and no. On my machine for instance, I can use different values for seek when using io.FileIO for instance. The reason is that lseek() is directly called: https://github.com/python/cpython/blob/bc93923a2dee00751e44da58b6967c63e3f5c392/Modules/_io/fileio.c#L1003-L1011 and https://github.com/python/cpython/blob/bc93923a2dee00751e44da58b6967c63e3f5c392/Modules/_io/fileio.c#L923.

Some IO objects will complain when you pass another kind of whence value, but not all of them. Those that complain are IIRC BytesIO and TextIOWrapper (the latter being returned when you do open()).