python-qt-tools / PyQt5-stubs

Stubs for PyQt5
GNU General Public License v3.0
66 stars 31 forks source link

Fix QByteArray to bytes conversion #143

Closed bluebird75 closed 3 years ago

bluebird75 commented 3 years ago

Fix QByteArray failing to convert to bytes

BryceBeagle commented 3 years ago

Looks like these changes cause the tests to fail in CI.

Also can you add an entry to the CHANGELOG.md? We should probably document PR workflow somewhere...

bluebird75 commented 3 years ago

The error reported by the CI is :

error: PyQt5.QtCore.QByteArray.__bytes__ is not present at runtime
Stub: at line 3749
def (self: PyQt5.QtCore.QByteArray) -> builtins.bytes
Runtime:
MISSING

Indeed, the method QByteArray.__bytes__() does not exist. The question is however, how does PyQt make bytes( QByteArray ) work then ? How to make mypy aware of it ?

BryceBeagle commented 3 years ago

I don't have a good answer, but is it possible that PyQt does something really bizarre and makes just QByteArray().__bytes__ exist? (i.e. not on the class but only on instances)

bluebird75 commented 3 years ago

It does not even exist on the instance. I suppose that it uses a Python C API trick to trigger the bytes conversion. See example :

>>> from PyQt5.QtCore import QByteArray
>>> QByteArray.__bytes__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'QByteArray' has no attribute '__bytes__'
>>> ba = QByteArray(1, '3')
>>> dir(ba)
['Base64Encoding', 'Base64Option', 'Base64Options', 'Base64UrlEncoding', 'KeepTrailingEquals', 'OmitTrailingEquals', '__add__', '__class__', '__contains__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__module__', '__mul__', '__ne__', '__new__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'append', 'at', 'capacity', 'chop', 'chopped', 'clear', 'compare', 'contains', 'count', 'data', 'endsWith', 'fill', 'fromBase64', 'fromHex', 'fromPercentEncoding', 'fromRawData', 'indexOf', 'insert', 'isEmpty', 'isLower', 'isNull', 'isUpper', 'lastIndexOf', 'left', 'leftJustified', 'length', 'mid', 'number', 'prepend', 'push_back', 'push_front', 'remove', 'repeated', 'replace', 'reserve', 'resize', 'right', 'rightJustified', 'setNum', 'simplified', 'size', 'split', 'squeeze', 'startsWith', 'swap', 'toBase64', 'toDouble', 'toFloat', 'toHex', 'toInt', 'toLong', 'toLongLong', 'toLower', 'toPercentEncoding', 'toShort', 'toUInt', 'toULong', 'toULongLong', 'toUShort', 'toUpper', 'trimmed', 'truncate']
>>> ba.__bytes__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'QByteArray' object has no attribute '__bytes__'
>>> bytes(ba)
b'3'

So, this CI test will never succeed. Is there a way to bypass this check ?

bluebird75 commented 3 years ago

The CI complains about method not present in the object, but the equivalent python operation actually works. So, what's correct ?

BryceBeagle commented 3 years ago

If we can't find a solution, I think we can add # type: ignore comment to the line in the stub so that stubtest doesn't complain about it anymore.

This may be a question for the python/typing gitter/matrix

altendky commented 3 years ago

https://gitter.im/python/typing?at=605e63159ebdfd16409efe93

I think that https://github.com/python/typeshed is missing a bytes.__new__() accepting a thing with __getitem__(self, item: int) -> int: (or whatever fancier version of that it should be). bytes() works with a class with just that and QByteArray has that.

https://github.com/python/typeshed/blob/839d711c6fd60ad17346dac6c24df5a3c65efd80/stdlib/builtins.pyi#L410-L420

class bytes(ByteString):
    @overload
    def __new__(cls: Type[_T], ints: Iterable[int]) -> _T: ...
    @overload
    def __new__(cls: Type[_T], string: str, encoding: str, errors: str = ...) -> _T: ...
    @overload
    def __new__(cls: Type[_T], length: int) -> _T: ...
    @overload
    def __new__(cls: Type[_T]) -> _T: ...
    @overload
    def __new__(cls: Type[_T], o: SupportsBytes) -> _T: ...
altendky commented 3 years ago

Hum... So this is kinda nice.

https://mypy-play.net/?mypy=latest&python=3.9&gist=974b35693fd30dcfb0c655ed9d4d847e

from collections.abc import ByteString
from typing import Iterable, Generic, overload, Protocol, Type, TypeVar

class E:
    def __getitem__(self, item) -> int:
        pass

_T = TypeVar("_T")
_T_co = TypeVar("_T_co", covariant=True)
_T_contra = TypeVar("_T_contra", contravariant=True)

class _SupportsGetItem(Protocol, Generic[_T_contra, _T_co]):
    def __getitem__(self, item: _T_contra) -> _T_co: ...

class MyBytes(ByteString):
    @overload
    def __new__(cls: Type[_T], ints: Iterable[int]) -> _T: ...
    @overload
    def __new__(cls: Type[_T], ints: _SupportsGetItem[object, int]) -> _T: ...
    @overload
    def __new__(cls: Type[_T], string: str, encoding: str, errors: str = ...) -> _T: ...
    @overload
    def __new__(cls: Type[_T], length: int) -> _T: ...
    @overload
    def __new__(cls: Type[_T]) -> _T: ...
    # @overload
    # def __new__(cls: Type[_T], o: SupportsBytes) -> _T: ...
    def __new__(cls, *args, **kwargs):
        return 0

    def __getitem__(self, item): ...
    def __len__(self, item): ...

e = E()
MyBytes(e)

But not quite since indexing a QByteArray gives you not an int but a bytes.

>>> b = QtCore.QByteArray(b'12')
>>> b
PyQt5.QtCore.QByteArray(b'12')
>>> b[0]
b'1'

And of course bytes() doesn't like that.

https://replit.com/@altendky/PuzzlingGigaCustomers-1#main.py

class C:
    def __getitem__(self, item):
        if item == 0:
            return b'3'

        raise IndexError()

c = C()
b = bytes(c)
print(b)
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    b = bytes(c)
TypeError: 'bytes' object cannot be interpreted as an integer

That makes it seem like this isn't actually the means by which bytes() is using QByteArray.

Maybe https://github.com/python/typeshed/commit/2cb4a2aa54b459137231514102a4fa2d94fdb35d is still a good change but I'm not convinced it is the proper one for us.

altendky commented 3 years ago

Maybe the buffer C-api interface?

https://github.com/python/cpython/blob/027b6699276ed8d9f68425697ac89bf61e54e6b9/Objects/bytesobject.c#L2832-L2834

/* Use the modern buffer interface */
if (PyObject_CheckBuffer(x))
    return _PyBytes_FromBuffer(x);

I think this is a hint that that may be true.

Python 3.9.1 (default, Jan  6 2021, 10:22:00) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from PyQt5 import QtCore
>>> application = QtCore.QCoreApplication([])
>>> qba = QtCore.QByteArray(b'\x01\x02')
>>> qba
PyQt5.QtCore.QByteArray(b'\x01\x02')
>>> memoryview(qba)
<memory at 0x7ff838f9d400>

https://bugs.python.org/issue27501 https://github.com/python/typing/issues/593

Maybe I'll look those over tomorrow to see if there are any good workarounds.

altendky commented 3 years ago

I didn't see anything as I read over the links. As is the hints are incorrect due to rejecting bytes(QByteArray()). The could also be incorrect by expressing the existence of .__bytes__() as you have done and ignoring the error in stubgen. There would also be a big note in the stubs linking here and also with a new issue to correct it I guess.

I don't like being wrong nor lying... I'm not sure which is worse. Can anyone think of significant scenarios where incorrectly hinting .__bytes__() would cause trouble? The obvious one is if someone literally calls .__bytes__() directly. Anything else?

BryceBeagle commented 3 years ago

Can anyone think of significant scenarios where incorrectly hinting .bytes() would cause trouble? The obvious one is if someone literally calls .bytes() directly.

The class would also incorrectly satisfy a SupportsBytes Protocol. This is probably fine in most cases, as I'd presume the application thereof would typically use bytes(), but the docs definitely refer directly to __bytes__' existence.

bluebird75 commented 3 years ago

Phil Thompson confirmed (in case there was any doubt left) that the conversion is using the buffer protocol : http://python.6.x6.nabble.com/How-does-QByteArray-convert-to-bytes-td5289854.html

I'll go ahead and fix the CI properly.

bluebird75 commented 3 years ago

This is probably ready to be merged.

altendky commented 3 years ago

Thanks again for the contribution and your patience. I hope that in the future I can enable a quicker workflow.

bluebird75 commented 3 years ago

Thanks for merging this.