🐛 Describe the bug
When I use ZeRO with TensorShardStrategy() to offload parameters and initialize the model inside ZeroInitContext(), I get AttributeError: 'Parameter' object has no attribute 'colo_attr'. The error output is as follows. If you have any questions, please let me know.
/home/dell/liulixin/vist/visual-storytelling/train.py:75 in main

    72                         shard_strategy=gpc.config.zero.model_config.s
    73                         shard_param=True,
    74                         ):
  ❱ 75         model = Blip2ForVIST.from_pretrained('./blip2-vist-2')
    76         #model = Blip2ForConditionalGeneration.from_pretrained('./blip
    77
    78     optimizer = AdamW(filter(lambda p: p.requires_grad, model.paramete

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/transformers/modeling_utils.py:2663 in from_pretrained

    2660                 offload_state_dict=offload_state_dict,
    2661                 dtype=torch_dtype,
    2662                 load_in_8bit=load_in_8bit,
  ❱ 2663                 keep_in_fp32_modules=keep_in_fp32_modules,
    2664             )
    2665
    2666         model.is_loaded_in_8bit = load_in_8bit

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/transformers/modeling_utils.py:2754 in _load_pretrained_model

    2751
    2752         is_sharded_safetensors = is_safetensors and sharded_metadata
    2753         # Retrieve missing & unexpected_keys
  ❱ 2754         model_state_dict = model.state_dict()
    2755         expected_keys = list(model_state_dict.keys())
    2756         prefix = model.base_model_prefix
    2757

/home/dell/anaconda3/envs/vist/lib/python3.7/site-packages/colossalai/zero/sharded_model/sharded_model_v2.py:443 in _colo_state_dict

    440                         process_group=None) -> 'OrderedDict[str, torc
    441         if len(sharded_params) == 0:
    442             for param in self.parameters():
  ❱ 443                 if param.colo_attr.param_is_sharded:
    444                     sharded_params.append(param)
    445         if shard_strategy is not None:
    446             shard_strategy.gather([p.colo_attr.sharded_data_tensor for

AttributeError: 'Parameter' object has no attribute 'colo_attr'
Process finished with exit code 1
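To help narrow this down, here is a rough sketch of how one could list which parameters are missing colo_attr before state_dict() is called. The helper name find_params_without_colo_attr is hypothetical (it is not part of ColossalAI or transformers), and it only assumes the model exposes named_parameters() the way torch.nn.Module does. The FakeModel class is a stand-in so the snippet runs without torch; in the real script the model built under ZeroInitContext would be passed instead.

```python
from types import SimpleNamespace


def find_params_without_colo_attr(model):
    """Return the names of parameters that have no `colo_attr` attribute.

    Hypothetical diagnostic helper: it only relies on `named_parameters()`
    yielding (name, parameter) pairs, as torch.nn.Module does.
    """
    return [
        name
        for name, param in model.named_parameters()
        if not hasattr(param, "colo_attr")
    ]


class FakeModel:
    """Stand-in for a real model so the sketch runs without torch installed."""

    def __init__(self, params):
        self._params = params

    def named_parameters(self):
        return iter(self._params.items())


# One parameter carries colo_attr, the other does not (names are made up).
fake = FakeModel({
    "encoder.weight": SimpleNamespace(colo_attr=object()),
    "decoder.weight": SimpleNamespace(),
})

missing = find_params_without_colo_attr(fake)
print(missing)
```

If this reports every parameter, the attribute was probably never attached at all; if it reports only a few, those are likely parameters created or replaced after the model's __init__ finished. That is only a guess on my side, not something confirmed from the ColossalAI source.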
Environment
GPU: 3090, CUDA 11.4

accelerate 0.18.0
aiohttp 3.8.4
aiosignal 1.3.1
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
async-timeout 4.0.2
asynctest 0.13.0
attrs 23.1.0
backcall 0.2.0
bcrypt 4.0.1
beautifulsoup4 4.12.2
bleach 6.0.0
certifi 2022.12.7
cffi 1.15.1
cfgv 3.3.1
charset-normalizer 3.1.0
click 8.1.3
colossalai 0.2.8
contexttimer 0.3.3
cryptography 40.0.2
datasets 2.11.0
debugpy 1.6.7
decorator 5.1.1
deepspeed 0.7.7
defusedxml 0.7.1
dill 0.3.6
distlib 0.3.6
entrypoints 0.4
evaluate 0.4.0
fabric 3.0.0
fastjsonschema 2.16.3
filelock 3.11.0
frozenlist 1.3.3
fsspec 2023.1.0
hjson 3.1.0
huggingface-hub 0.13.4
identify 2.5.22
idna 3.4
importlib-metadata 6.3.0
importlib-resources 5.12.0
invoke 2.0.0
ipykernel 6.16.2
ipython 7.34.0
ipython-genutils 0.2.0
ipywidgets 8.0.6
jedi 0.18.2
Jinja2 3.1.2
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 7.4.9
jupyter-console 6.6.3
jupyter_core 4.12.0
jupyter-server 1.24.0
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib-inline 0.1.6
mdurl 0.1.2
mistune 2.0.5
multidict 6.0.4
multiprocess 0.70.14
nbclassic 0.5.5
nbclient 0.7.3
nbconvert 7.3.1
nbformat 5.8.0
nest-asyncio 1.5.6
ninja 1.11.1
nodeenv 1.7.0
notebook 6.5.4
notebook_shim 0.2.2
numpy 1.21.6
opencv-python 4.7.0.72
packaging 23.0
pandas 1.3.5
pandocfilters 1.5.0
paramiko 3.1.0
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 3.2.0
pre-commit 2.21.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
psutil 5.9.4
ptyprocess 0.7.0
py-cpuinfo 9.0.0
pyarrow 11.0.0
pycparser 2.21
pydantic 1.10.7
Pygments 2.15.0
PyNaCl 1.5.0
pyrsistent 0.19.3
python-dateutil 2.8.2
pytz 2023.3
PyYAML 6.0
pyzmq 25.0.2
qtconsole 5.4.2
QtPy 2.3.1
regex 2022.10.31
requests 2.28.2
responses 0.18.0
rich 13.3.4
Send2Trash 1.8.0
setuptools 65.6.3
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
terminado 0.17.1
tinycss2 1.2.1
tokenizers 0.13.3
torch 1.12.1+cu113
torchaudio 0.12.1+cu113
torchvision 0.13.1+cu113
tornado 6.2
tqdm 4.65.0
traitlets 5.9.0
transformers 4.27.4
typing_extensions 4.5.0
urllib3 1.26.15
virtualenv 20.21.0
wcwidth 0.2.6
webencodings 0.5.1
websocket-client 1.5.1
wheel 0.38.4
widgetsnbextension 4.0.7
xxhash 3.2.0
yarl 1.8.2
zipp 3.15.0