hpcaitech / ColossalAI


[BUG]: AttributeError: 'Parameter' object has no attribute 'colo_attr' #3739

Open lixinliu1995 opened 1 year ago

lixinliu1995 commented 1 year ago

🐛 Describe the bug

When I use ZeRO with TensorShardStrategy() to offload parameters and initialize the model inside ZeroInitContext(), I get AttributeError: 'Parameter' object has no attribute 'colo_attr'. The error output is as follows. If you have any other questions, please let me know.

/home/dell/liulixin/vist/visual-storytelling/train.py:75 in main

    72                         shard_strategy=gpc.config.zero.model_config.s
    73                         shard_param=True,
    74                         ):
  ❱ 75         model = Blip2ForVIST.from_pretrained('./blip2-vist-2')
    76         #model = Blip2ForConditionalGeneration.from_pretrained('./blip
    77
    78     optimizer = AdamW(filter(lambda p: p.requires_grad, model.paramete

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/transformers/modeling_utils.py:2663 in from_pretrained

    2660                 offload_state_dict=offload_state_dict,
    2661                 dtype=torch_dtype,
    2662                 load_in_8bit=load_in_8bit,
  ❱ 2663                 keep_in_fp32_modules=keep_in_fp32_modules,
    2664             )
    2665
    2666         model.is_loaded_in_8bit = load_in_8bit

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/transformers/modeling_utils.py:2754 in _load_pretrained_model

    2751
    2752         is_sharded_safetensors = is_safetensors and sharded_metadata
    2753         # Retrieve missing & unexpected_keys
  ❱ 2754         model_state_dict = model.state_dict()
    2755         expected_keys = list(model_state_dict.keys())
    2756         prefix = model.base_model_prefix
    2757

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/colossalai/zero/sharded_model/sharded_model_v2.py:443 in _colo_state_dict

    440                         process_group=None) -> 'OrderedDict[str, torc
    441         if len(sharded_params) == 0:
    442             for param in self.parameters():
  ❱ 443                 if param.colo_attr.param_is_sharded:
    444                     sharded_params.append(param)
    445         if shard_strategy is not None:
    446             shard_strategy.gather([p.colo_attr.sharded_data_tensor for

AttributeError: 'Parameter' object has no attribute 'colo_attr'

Process finished with exit code 1
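The traceback shows `_colo_state_dict` reading `param.colo_attr` on every parameter returned by `self.parameters()`, so one parameter without that attribute is enough to trigger the error. A minimal diagnostic sketch for finding such parameters follows. The helper name `find_params_missing_attr` is hypothetical and not part of ColossalAI; it only assumes an iterable of `(name, parameter)` pairs such as the one returned by `torch.nn.Module.named_parameters()`. The demo uses plain stand-in objects so it runs without torch.

```python
from types import SimpleNamespace


def find_params_missing_attr(named_params, attr_name="colo_attr"):
    """Return the names of parameters that do not carry `attr_name`.

    `named_params` is any iterable of (name, parameter) pairs, for example
    `model.named_parameters()` on a torch.nn.Module.
    """
    return [name for name, param in named_params if not hasattr(param, attr_name)]


if __name__ == "__main__":
    # Stand-in objects instead of real torch parameters (illustration only).
    demo_params = [
        ("with_attr.weight", SimpleNamespace(colo_attr=object())),
        ("without_attr.weight", SimpleNamespace()),
    ]
    print(find_params_missing_attr(demo_params))
```

On a real model this could be called as `find_params_missing_attr(model.named_parameters())` before anything calls `state_dict()`. Since the failing call here happens inside `from_pretrained`, the check would likely have to run from a debugger or a temporary patch at that point.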

Environment

3090, CUDA 11.4

accelerate 0.18.0
aiohttp 3.8.4
aiosignal 1.3.1
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
async-timeout 4.0.2
asynctest 0.13.0
attrs 23.1.0
backcall 0.2.0
bcrypt 4.0.1
beautifulsoup4 4.12.2
bleach 6.0.0
certifi 2022.12.7
cffi 1.15.1
cfgv 3.3.1
charset-normalizer 3.1.0
click 8.1.3
colossalai 0.2.8
contexttimer 0.3.3
cryptography 40.0.2
datasets 2.11.0
debugpy 1.6.7
decorator 5.1.1
deepspeed 0.7.7
defusedxml 0.7.1
dill 0.3.6
distlib 0.3.6
entrypoints 0.4
evaluate 0.4.0
fabric 3.0.0
fastjsonschema 2.16.3
filelock 3.11.0
frozenlist 1.3.3
fsspec 2023.1.0
hjson 3.1.0
huggingface-hub 0.13.4
identify 2.5.22
idna 3.4
importlib-metadata 6.3.0
importlib-resources 5.12.0
invoke 2.0.0
ipykernel 6.16.2
ipython 7.34.0
ipython-genutils 0.2.0
ipywidgets 8.0.6
jedi 0.18.2
Jinja2 3.1.2
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 7.4.9
jupyter-console 6.6.3
jupyter_core 4.12.0
jupyter-server 1.24.0
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib-inline 0.1.6
mdurl 0.1.2
mistune 2.0.5
multidict 6.0.4
multiprocess 0.70.14
nbclassic 0.5.5
nbclient 0.7.3
nbconvert 7.3.1
nbformat 5.8.0
nest-asyncio 1.5.6
ninja 1.11.1
nodeenv 1.7.0
notebook 6.5.4
notebook_shim 0.2.2
numpy 1.21.6
opencv-python 4.7.0.72
packaging 23.0
pandas 1.3.5
pandocfilters 1.5.0
paramiko 3.1.0
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 3.2.0
pre-commit 2.21.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
psutil 5.9.4
ptyprocess 0.7.0
py-cpuinfo 9.0.0
pyarrow 11.0.0
pycparser 2.21
pydantic 1.10.7
Pygments 2.15.0
PyNaCl 1.5.0
pyrsistent 0.19.3
python-dateutil 2.8.2
pytz 2023.3
PyYAML 6.0
pyzmq 25.0.2
qtconsole 5.4.2
QtPy 2.3.1
regex 2022.10.31
requests 2.28.2
responses 0.18.0
rich 13.3.4
Send2Trash 1.8.0
setuptools 65.6.3
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
terminado 0.17.1
tinycss2 1.2.1
tokenizers 0.13.3
torch 1.12.1+cu113
torchaudio 0.12.1+cu113
torchvision 0.13.1+cu113
tornado 6.2
tqdm 4.65.0
traitlets 5.9.0
transformers 4.27.4
typing_extensions 4.5.0
urllib3 1.26.15
virtualenv 20.21.0
wcwidth 0.2.6
webencodings 0.5.1
websocket-client 1.5.1
wheel 0.38.4
widgetsnbextension 4.0.7
xxhash 3.2.0
yarl 1.8.2
zipp 3.15.0

lixinliu1995 commented 1 year ago

My e-mail is 3328694288@qq.com