ray-project / xgboost_ray

Distributed XGBoost on Ray
Apache License 2.0

`TypeError: Descriptors cannot not be created directly` when importing xgboost_ray #220

Status: Closed (gcaria closed this issue 2 years ago)

gcaria commented 2 years ago

I am using the rayproject/ray-ml:nightly-py38-cpu image and I get:

> docker run -it ray-ml-new
(base) ray@44c9e3c64153:~$ python
Python 3.8.5 (default, Sep  4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xgboost_ray
/home/ray/anaconda3/lib/python3.8/site-packages/xgboost/compat.py:31: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/anaconda3/lib/python3.8/site-packages/xgboost_ray/__init__.py", line 1, in <module>
    from xgboost_ray.main import RayParams, train, predict
  File "/home/ray/anaconda3/lib/python3.8/site-packages/xgboost_ray/main.py", line 28, in <module>
    from xgboost_ray.callback import DistributedCallback, \
  File "/home/ray/anaconda3/lib/python3.8/site-packages/xgboost_ray/callback.py", line 6, in <module>
    from ray.util.annotations import PublicAPI, DeveloperAPI
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/__init__.py", line 115, in <module>
    import ray._raylet  # noqa: E402
  File "python/ray/_raylet.pyx", line 116, in init ray._raylet
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/exceptions.py", line 7, in <module>
    from ray.core.generated.common_pb2 import RayException, Language, PYTHON
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/core/generated/common_pb2.py", line 15, in <module>
    from . import runtime_env_common_pb2 as src_dot_ray_dot_protobuf_dot_runtime__env__common__pb2
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/core/generated/runtime_env_common_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Yard1 commented 2 years ago

It looks like the latest Protobuf release (https://pypi.org/project/protobuf/4.21.0/) broke the generated `_pb2` modules that Ray ships. We are working on a fix. In the meantime, can you try `pip install "protobuf<4.21.0"`?
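
For anyone who wants to check whether their environment is affected before importing Ray, here is a minimal sketch. It assumes the breaking release is 4.21.0, as suggested above; the helper names (`parse_version`, `protobuf_needs_pin`, `installed_protobuf_version`) are made up for illustration and are not part of Ray or xgboost_ray.

```python
import re
from importlib import metadata


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def protobuf_needs_pin(version, first_bad=(4, 21, 0)):
    """True if `version` is at or above the release assumed to break Ray.

    The (4, 21, 0) threshold is an assumption taken from this thread.
    """
    return parse_version(version) >= first_bad


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed")
    elif protobuf_needs_pin(found):
        print(f"protobuf {found} is likely affected; consider pinning below 4.21.0")
    else:
        print(f"protobuf {found} should be fine")
```

If the check reports an affected version, the `pip install "protobuf<4.21.0"` pin from this comment is the quickest way back to a working import.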

gcaria commented 2 years ago

Thanks, that fixes the error (as does setting `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python`), but I mostly just wanted to flag the issue.
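
For reference, the environment-variable workaround can also be applied from inside a script, as long as it happens before anything imports protobuf. A rough sketch follows; the function name `use_pure_python_protobuf` is hypothetical, and, as the error message notes, the pure-Python parser is much slower, so this is only a stopgap.

```python
import os
import sys


def use_pure_python_protobuf():
    """Select protobuf's pure-Python implementation via the environment.

    Returns True if the variable was set before google.protobuf was
    imported in this process, False if it was probably set too late
    to affect the already-loaded module.
    """
    set_in_time = "google.protobuf" not in sys.modules
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
    return set_in_time


if __name__ == "__main__":
    if not use_pure_python_protobuf():
        print("warning: protobuf was already imported; restart the interpreter")
    # Import Ray-based packages only after this point, for example:
    # import xgboost_ray
```

The call has to come first in the entry-point module; setting the variable after `import ray` or `import xgboost_ray` has already failed will not help within the same interpreter session.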