Open Flamefire opened 5 years ago
All numbers.Integral are accepted in python protobuf setters for int32 field: https://github.com/protocolbuffers/protobuf/blob/f50a1f843e8c2599a92e9d6bd5324462a5233367/python/google/protobuf/internal/type_checkers.py#L130
I am not sure why windows cause failures on windows, but I'd assume it was not from the set to a protobuf repeated field?
I am not sure why windows cause failures on windows, but I'd assume it was not from the set to a protobuf repeated field?
A numpy bool
is not a numbers.Integral
so the failure is actually the expected behaviour. The bug I'm reporting is that it does not fail in some circumstances (e.g. on everything but windows).
I traced this down to the fact that there are 2 different implementations of the RepeatedScalarContainer
: 1 in C++ and 1 in Python. The latter does check the type and hence (correctly) fails, but the C++ version does not and hence is wrong.
To reproduce simply download protobuf with pip on linux and assign a numpy bool array to a int32 repeated scalar field. This should work although it is wrong.
Yes we do have pure python and cpp extension. Looks like the issue is not for different platform, it is different protobuf implementations.
For cpp extension, we do have type checks as well: https://github.com/protocolbuffers/protobuf/blob/46a48e49aa8357bbeee8040819a35e59880e329a/python/google/protobuf/pyext/message.cc#L679
It was surprise me that a Python bools are a subclass of int but numpy bools are not subclass of number type.
My understand is that all types are acceptable if it can be safely convert to the expected type. For example a bool field can accept integral: https://github.com/protocolbuffers/protobuf/blob/46a48e49aa8357bbeee8040819a35e59880e329a/python/google/protobuf/internal/type_checkers.py#L239
Change the pure python to accept numpy bool make more sense to me
There was a discussion about this: https://github.com/numpy/numpy/issues/3580
Essentially:
bool
are integers where True==1
and False==0
-> True+True=2
bool_
are real bools, but they can be upcast to integer when combined in calculations. But: bool_(1) + bool_(1) == bool_(1)
(aka AND
not PLUS
)Note that there is also an example posted: Isinstance(np.int8(0), numbers.Integral) -> False
which does no longer hold (it returns True as it should)
Furthermore: a=np.bool_(5); b=a+0
-> a==True
&& b == 1
Given this I suggest to change the python checker to: isinstance(t, numbers.Integral) or isinstance(t+0, numbers.Integral)
where an exception should be caught and reported as False
. This would match the C code: Either it is an Integral or it can be cast/used as one.
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive
because the last activity was over 90 days ago.
Has this been adressed/fixed?
What version of protobuf and what language are you using? Version: /v3.6.1 Language: Python
What operating system (Linux, Windows, ...) and version? Linux Mint
What runtime / compiler are you using (e.g., python version or gcc version) Python 3.6.0
What did you do?
extend(np.array([True], dtype=np.bool))
on itWhat did you expect to see Error about type mismatch (bool is not int32) consistently
What did you see instead? No error on Linux with pip-installed versions Error when using Git-Version (installed with
pip -e
) or on windows.This came up during a PR for ONNX which uses protobuf (https://github.com/onnx/onnx/pull/1540). The same code causes a failure on Windows, but not on any Linux version tested. I checked out the Github version to be able to debug this and there it errored. Using VSCode I found that the
extend
method is a "native" method when it does not error and a python method when it does.TLDR: Sometimes the python
RepeatedScalarFieldContainer
is used where the type is checked on append (https://github.com/protocolbuffers/protobuf/blob/63d2f3bc80a870a09c6aea74025df40aad7dba04/python/google/protobuf/internal/containers.py#L255) and sometimes the native class is used which does not have this type check: https://github.com/protocolbuffers/protobuf/blob/63d2f3bc80a870a09c6aea74025df40aad7dba04/python/google/protobuf/pyext/repeated_scalar_container.cc#L344This also has the side effect, that the functions
extend
andinsert
are not provided in the native case.Question: When/How is decided what to use?
Bug: Why is the type not checked in the native case and an int32 happily accepts a numpy bool?