zjunlp / OntoProtein

[ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding
MIT License

ImportError in deepspeed.py #22

Closed tomcobley closed 1 year ago

tomcobley commented 1 year ago

Hi,

Thanks for open-sourcing this really cool model.

I'm trying to play around with pretraining it myself, but I run into this ImportError when I run the run_pretrain.sh script. I would greatly appreciate any guidance, thanks!

(onto_env) tomcobley@compute-g-17-147:~/OntoProtein $ bash script/run_pretrain.sh 

[2022-11-26 21:06:55,053] [WARNING] [runner.py:179:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2022-11-26 21:06:56,224] [INFO] [runner.py:508:main] cmd = /home/tomcobley/.conda/envs/onto_env/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 run_pretrain.py --do_train --output_dir data/output_data/filtered_ke_text --pretrain_data_dir path/todata/ProteinKG25 --protein_seq_data_file_name swiss_seq --in_memory true --max_protein_seq_length 1024 --model_protein_seq_data true --model_protein_go_data true --model_go_go_data true --use_desc true --max_text_seq_length 128 --dataloader_protein_go_num_workers 1 --dataloader_go_go_num_workers 1 --dataloader_protein_seq_num_workers 1 --num_protein_go_neg_sample 128 --num_go_go_neg_sample 128 --negative_sampling_fn simple_random --protein_go_sample_head false --protein_go_sample_tail true --go_go_sample_head true --go_go_sample_tail true --protein_model_file_name data/model_data/ProtBERT --text_model_file_name data/model_data/PubMedBERT --go_encoder_cls bert --protein_encoder_cls bert --ke_embedding_size 512 --double_entity_embedding_size false --max_steps 60000 --per_device_train_batch_size 4 --weight_decay 0.01 --optimize_memory true --gradient_accumulation_steps 256 --lr_scheduler_type linear --mlm_lambda 1.0 --lm_learning_rate 1e-5 --lm_warmup_steps 50000 --ke_warmup_steps 50000 --ke_lambda 1.0 --ke_learning_rate 2e-5 --ke_max_score 12.0 --ke_score_fn transE --ke_warmup_ratio --seed 2021 --deepspeed dp_config.json --fp16 --dataloader_pin_memory

Traceback (most recent call last):
  File "/home/tomcobley/.conda/envs/onto_env/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/home/tomcobley/.conda/envs/onto_env/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/tomcobley/OntoProtein/deepspeed.py", line 25, in <module>
    from .dependency_versions_check import dep_version_check
ImportError: attempted relative import with no known parent package

Alexzhuan commented 1 year ago

Hi,

You can remove the deepspeed.py file from the project. It was included to keep the deepspeed module consistent with the version of the transformers library that we used (the deepspeed module in the latest version of transformers has changed).
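
For context, the traceback above shows `python -m deepspeed.launcher.launch` loading /home/tomcobley/OntoProtein/deepspeed.py rather than the installed package, which suggests that the local file shadows the installed deepspeed module when the script is run from the project root. Below is a minimal sketch for checking this before deleting anything. It uses only the standard library, and the helper names `find_local_shadow` and `resolved_origin` are hypothetical, not part of OntoProtein or DeepSpeed:

```python
import importlib.util
from pathlib import Path


def find_local_shadow(module_name, project_dir):
    """Return a file in project_dir that could shadow module_name, or None.

    Hypothetical helper: checks only for `<name>.py` and `<name>/__init__.py`
    directly inside project_dir.
    """
    root = Path(project_dir)
    candidates = [
        root / f"{module_name}.py",
        root / module_name / "__init__.py",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def resolved_origin(module_name):
    """Return the file Python would load for module_name, or None if not found.

    For a top-level name, find_spec only locates the module; it does not run it.
    """
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Assumption: this is run from the OntoProtein project root.
    print("local shadow:", find_local_shadow("deepspeed", "."))
    print("import resolves to:", resolved_origin("deepspeed"))
```

Run from the project root, this should report the local deepspeed.py while it exists. After the file is removed, the resolved origin would be expected to point into the environment's site-packages directory, assuming deepspeed is installed there.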

tomcobley commented 1 year ago

Ok, thanks!