When I tried running the tests off the chi-2021-demo branch in a newly built image, I saw errors like this:
...
tests/test_extract_definitions.py:7: in <module>
from entities.definitions.commands.detect_definitions import (
entities/definitions/__init__.py:13: in <module>
from .commands.detect_definitions import DetectDefinitions
entities/definitions/commands/detect_definitions.py:16: in <module>
from ..nlp import DefinitionDetectionModel
entities/definitions/nlp.py:12: in <module>
from transformers import (CONFIG_MAPPING, AutoConfig, AutoTokenizer,
/usr/local/lib/python3.7/dist-packages/transformers/__init__.py:345: in <module>
from .trainer import Trainer, set_seed, torch_distributed_zero_first, EvalPrediction
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:64: in <module>
import wandb
/usr/local/lib/python3.7/dist-packages/wandb/__init__.py:37: in <module>
from wandb import sdk as wandb_sdk
/usr/local/lib/python3.7/dist-packages/wandb/sdk/__init__.py:12: in <module>
from .wandb_init import init # noqa: F401
/usr/local/lib/python3.7/dist-packages/wandb/sdk/wandb_init.py:28: in <module>
from .backend.backend import Backend
/usr/local/lib/python3.7/dist-packages/wandb/sdk/backend/backend.py:14: in <module>
from ..interface import interface
/usr/local/lib/python3.7/dist-packages/wandb/sdk/interface/interface.py:17: in <module>
from wandb.proto import wandb_internal_pb2 # type: ignore
/usr/local/lib/python3.7/dist-packages/wandb/proto/wandb_internal_pb2.py:37: in <module>
type=None),
/usr/local/lib/python3.7/dist-packages/google/protobuf/descriptor.py:755: in __new__
_message.Message._CheckCalledFromGeneratedFile()
E TypeError: Descriptors cannot not be created directly.
E If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
E If you cannot immediately regenerate your protos, some other possible workarounds are:
E 1. Downgrade the protobuf package to 3.20.x or lower.
E 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
E
E More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
...
I saw similar errors trying to run a paper through a newly built image:
...
Traceback (most recent call last):
File "scripts/run_pipeline.py", line 32, in <module>
from entities.definitions.commands.detect_definitions import DetectDefinitions
File "/data-processing/entities/definitions/__init__.py", line 13, in <module>
from .commands.detect_definitions import DetectDefinitions
File "/data-processing/entities/definitions/commands/detect_definitions.py", line 16, in <module>
from ..nlp import DefinitionDetectionModel
File "/data-processing/entities/definitions/nlp.py", line 12, in <module>
from transformers import (CONFIG_MAPPING, AutoConfig, AutoTokenizer,
File "/usr/local/lib/python3.7/dist-packages/transformers/__init__.py", line 345, in <module>
from .trainer import Trainer, set_seed, torch_distributed_zero_first, EvalPrediction
File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 64, in <module>
import wandb
File "/usr/local/lib/python3.7/dist-packages/wandb/__init__.py", line 37, in <module>
from wandb import sdk as wandb_sdk
File "/usr/local/lib/python3.7/dist-packages/wandb/sdk/__init__.py", line 12, in <module>
from .wandb_init import init # noqa: F401
File "/usr/local/lib/python3.7/dist-packages/wandb/sdk/wandb_init.py", line 28, in <module>
from .backend.backend import Backend
File "/usr/local/lib/python3.7/dist-packages/wandb/sdk/backend/backend.py", line 14, in <module>
from ..interface import interface
File "/usr/local/lib/python3.7/dist-packages/wandb/sdk/interface/interface.py", line 17, in <module>
from wandb.proto import wandb_internal_pb2 # type: ignore
File "/usr/local/lib/python3.7/dist-packages/wandb/proto/wandb_internal_pb2.py", line 37, in <module>
type=None),
File "/usr/local/lib/python3.7/dist-packages/google/protobuf/descriptor.py", line 755, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
...
Bumping the pinned wandb version to 0.12.18 appears to fix things - I think that makes sense given the following line in their changelog:
Require protobuf<4 by @dmitryduev in https://github.com/wandb/client/pull/3709
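For anyone who wants to confirm which versions ended up in a freshly built image, here is a minimal sketch. The helper names (`is_pre_4`, `installed_version`) are made up for illustration and are not part of this repo. The only assumption about the failure is that it shows up when protobuf 4.x is installed, as the error message above suggests.

```python
from typing import Optional

try:
    # Standard library on Python 3.8+.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7 (as in the image above) would need the importlib_metadata backport.
    from importlib_metadata import PackageNotFoundError, version


def is_pre_4(version_string: str) -> bool:
    """Return True if a version string such as '3.20.1' has a major version below 4."""
    major = int(version_string.split(".")[0])
    return major < 4


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report what the current environment resolved for the two packages involved.
    for name in ("wandb", "protobuf"):
        print(name, installed_version(name))
    protobuf_version = installed_version("protobuf")
    if protobuf_version is not None and not is_pre_4(protobuf_version):
        print("protobuf is 4.x or newer; older generated _pb2.py files may fail to import")
```

Running this inside the container should show whether the build picked up a protobuf release below 4.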
The tests pass with this change. I also ran a paper through an image built with this change and through the image we currently have deployed, and diffing the output files showed no differences.
I think this is related to https://github.com/protocolbuffers/protobuf/issues/10051.