Unexpected key(s) in state_dict: "cond_stage_model.transformer.text_model.embeddings.position_ids".

sooxp commented 5 months ago

Hi, thanks for making the project code open source! When I ran scripts/inference_any_image_pose.sh, I got the following error message:


Loaded model config from [model_lib/ControlNet/models/cldm_v15_reference_only_pose.yaml]
Total base  parameters 2288.11M
find model state dict from pretrained_weights/model_state-110000.th ...
Loading model state dict from pretrained_weights/model_state-110000.th ...
Traceback (most recent call last):
  File "/data/workstation/MagicDance/test_any_image_pose.py", line 577, in <module>
    main(args)
  File "/data/workstation/MagicDance/test_any_image_pose.py", line 371, in main
    load_state_dict(model, args.image_pretrain_dir,strict=True) 
  File "/data/workstation/MagicDance/test_any_image_pose.py", line 126, in load_state_dict
    model.load_state_dict(state_dict, strict=strict)
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ControlLDMReferenceOnlyPose:
        Unexpected key(s) in state_dict: "cond_stage_model.transformer.text_model.embeddings.position_ids". 
[2024-02-18 05:16:42,066] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 34957) of binary: /home/user01/miniconda3/envs/dpe/bin/python3
Traceback (most recent call last):
  File "/home/user01/miniconda3/envs/dpe/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
test_any_image_pose.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-02-18_05:16:42
  host      : user01-wt
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 34957)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

What's wrong? Thx.

Boese0601 commented 5 months ago

Hi,

Can you double-check whether your installed versions of diffusers and transformers match the ones in environment.yml?

If they do, you can also set strict to False. That shouldn't affect the result, since our pipeline never actually passes any text to the "text_model".

Let me know if you have any further questions.
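An alternative to turning strict checking off entirely is to drop only the keys the model does not expect and keep strict loading for everything else. The sketch below is not code from this repository: `drop_unexpected_keys` is a hypothetical helper, and plain lists stand in for the tensors so that it runs without PyTorch. The key name is copied from the error message above.

```python
from collections import OrderedDict

# Key name copied from the RuntimeError in the original report.
POSITION_IDS_KEY = "cond_stage_model.transformer.text_model.embeddings.position_ids"


def drop_unexpected_keys(state_dict, expected_keys):
    """Hypothetical helper: keep only the entries the model expects.

    Returns the filtered state dict and the list of key names that were dropped.
    """
    expected = set(expected_keys)
    kept = OrderedDict((k, v) for k, v in state_dict.items() if k in expected)
    dropped = [k for k in state_dict if k not in expected]
    return kept, dropped


if __name__ == "__main__":
    # Toy stand-ins: in practice the values are tensors and the expected
    # keys would come from model.state_dict().keys().
    checkpoint = OrderedDict([
        ("model.weight", [1.0, 2.0]),
        (POSITION_IDS_KEY, [0, 1, 2]),
    ])
    kept, dropped = drop_unexpected_keys(checkpoint, ["model.weight"])
    print("dropped:", dropped)
```

With a real model, the filtered dict would then be passed to `model.load_state_dict(kept, strict=True)`, so that genuinely missing weights are still reported instead of being silently ignored.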

sooxp commented 5 months ago

Solved. The problem was the version of transformers. Thanks.

zhjj666 commented 2 months ago

I have checked the versions of diffusers and transformers and also tried setting strict to False, but the error above still appears.

goshiaoki commented 1 month ago

Same problem here. My versions of transformers, diffusers, and PyTorch are as follows.

(magicpose) 10-76-1-24% pip show transformers
Name: transformers
Version: 4.22.1

(magicpose) 10-76-1-24% pip show diffusers
Name: diffusers
Version: 0.11.1

(magicpose) 10-76-1-24% pip show torch  
Name: torch
Version: 1.13.1
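
The same check can be run from inside the Python interpreter that actually launches the script, which rules out a mismatch between the shell's pip and the active environment. This is a minimal sketch using only the standard library. `installed_versions` is a hypothetical helper name, and the versions pinned in environment.yml are not quoted in this thread, so the output still has to be compared against that file by hand.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The three packages discussed in this thread.
    for name, version in installed_versions(["transformers", "diffusers", "torch"]).items():
        print(f"{name}: {version or 'not installed'}")
```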

goshiaoki commented 1 month ago

After I deleted everything and downloaded it all again from scratch, it works. Thank you.

aycaecemgul commented 2 days ago

Mentioning environment.yml in the README would be helpful.
