zjunlp / OntoProtein

[ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding
MIT License

Issue in creating an environment for OntoProtein pretraining #12

Closed amalislam675 closed 1 year ago

amalislam675 commented 1 year ago

Hello researchers, I am running into an error after installing deepspeed 0.5.1. I have already installed Python 3.8.13, pytorch=1.12.0 with torchvision=0.13.0, torchaudio=0.12.0, cudatoolkit=11.3.1, transformers=4.9.2, and lmdb=1.3.0. But when I install deepspeed=0.5.1, its dependencies are not installed correctly. Can you please tell me the exact versions of pytorch, Python, and deepspeed that you used? Below is the error I get:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/deepspeed/__init__.py", line 15, in <module>
    from .runtime.engine import DeepSpeedEngine, DeepSpeedOptimizerCallable, DeepSpeedSchedulerCallable
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 20, in <module>
    from tensorboardX import SummaryWriter
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/__init__.py", line 5, in <module>
    from .torchvis import TorchVis
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/torchvis.py", line 11, in <module>
    from .writer import SummaryWriter
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/writer.py", line 15, in <module>
    from .event_file_writer import EventFileWriter
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/event_file_writer.py", line 28, in <module>
    from .proto import event_pb2
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/proto/event_pb2.py", line 15, in <module>
    from tensorboardX.proto import summary_pb2 as tensorboardX_dot_proto_dot_summary__pb2
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/proto/summary_pb2.py", line 15, in <module>
    from tensorboardX.proto import tensor_pb2 as tensorboardX_dot_proto_dot_tensor__pb2
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/tensorboardX/proto/tensor_pb2.py", line 15, in <module>
    from tensorboardX.proto import resource_handle_pb2 as tensorboardX_dot_proto_dot_resource__handle__pb2
  File "//mnt/user1/envs/pretraining/lib/python3.8/site-packages/tensorboardX/proto/resource_handle_pb2.py", line 35, in <module>
    _descriptor.FieldDescriptor(
  File "//mnt/user1/.conda/envs/pretraining/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

Can you tell me the exact versions you used?

Alexzhuan commented 1 year ago

Hi,

We have provided our experimental configuration in the README (python 3.8 / pytorch 1.9 / transformers 4.5.1+ / deepspeed 0.5.1).

As for the problem you're having, we have not encountered it before. You could try the workarounds suggested in the error message (for example, downgrading the protobuf package to 3.20.x or lower).
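
For anyone hitting the same traceback, here is a rough sketch of how one might check whether the installed protobuf is newer than the 3.20.x line named in the error message. The helper names (`parse_major_minor`, `protobuf_needs_downgrade`, `installed_protobuf_version`) are made up for illustration, and the suggested pip constraint is an assumption, not something taken from the OntoProtein README.

```python
from importlib import metadata  # in the standard library since Python 3.8


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '3.20.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def protobuf_needs_downgrade(version):
    """True if the version is newer than the 3.20.x line from the error message."""
    return parse_major_minor(version) > (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is missing."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed in this environment")
    elif protobuf_needs_downgrade(found):
        # Assumed constraint: any release below 3.21 is 3.20.x or lower.
        print(f'protobuf {found} is too new; consider: pip install "protobuf<3.21"')
    else:
        print(f"protobuf {found} is already 3.20.x or lower")
```

The other workaround from the error message, setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, would have to be in the environment before deepspeed is imported, and the message itself warns that it is much slower.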

amalislam675 commented 1 year ago

Thank you, I will try it. I have read the README. Can you please tell me exactly which Python 3.8.x version was used, and which PyTorch 1.9.x version? The same question applies to transformers. I have installed Python 3.8.13, PyTorch 1.12.0, and transformers 4.9.2 in a new environment. The problem occurs during the deepspeed installation.

Alexzhuan commented 1 year ago

python==3.8.8
transformers==4.9.2
pytorch==1.9.0
deepspeed==0.5.1
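
A small sketch, in case it helps, for comparing an environment against the versions listed in this thread. The names `PINNED`, `find_mismatches`, and `installed_versions` are hypothetical, and using `torch` as the pip distribution name for pytorch is an assumption of this sketch. The Python version itself is not checked here.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions quoted in this thread; "torch" is assumed to be the installed
# distribution name for pytorch.
PINNED = {
    "transformers": "4.9.2",
    "torch": "1.9.0",
    "deepspeed": "0.5.1",
}


def find_mismatches(pinned, installed):
    """Return {name: (wanted, found)} for packages whose version differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        # Ignore local build tags such as "+cu111" when comparing.
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up the installed version of each package, or None if missing."""
    result = {}
    for name in names:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    problems = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (wanted, found) in problems.items():
        print(f"{name}: wanted {wanted}, found {found}")
    if not problems:
        print("all pinned packages match")
```

With the versions from the original report (pytorch 1.12.0), only the `torch` entry would be flagged.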

amalislam675 commented 1 year ago

Thank you.

amalislam675 commented 1 year ago

The solution mentioned above works fine for me. Thank you.