Mq-Zhang1 / HOIDiffusion

Official Code Release for HOIDiffusion (CVPR 2024)
MIT License

Wrong number of fingers generated #5

Open LDS666888 opened 1 week ago

LDS666888 commented 1 week ago

[Image: ToyCar_0_2_4]

I used the following command to generate the image:

torchrun --nproc_per_node=1 --nnodes=1 --node_rank=0 --master_addr="localhost" --master_port=59726 test_dex.py --which_cond dex --bs 1 --cond_weight 1 --sd_ckpt "F:\data_enhancement\HOIDiffusion-main\stable_difussion_model\sd-v1-4.ckpt" --cond_tau 1 --adapter_ckpt "F:\data_enhancement\HOIDiffusion-main\midas_models\t2iadapter_depth_sd14v1.pth" --cond_inp_type image --input "F:\data_enhancement\HOIDiffusion-main\output\depth" --file "F:\data_enhancement\HOIDiffusion-main\output\train.csv" --outdir "F:\data_enhancement\HOIDiffusion-main\test_dex_outdir

The settings in inference_base.py are as follows:

def get_adapters(opt, cond_type: ExtraCondition):
    adapter = {}
    cond_weight = getattr(opt, f'{cond_type.name}_weight', None)
    if cond_weight is None:
        cond_weight = getattr(opt, 'cond_weight')
    adapter['cond_weight'] = cond_weight

    adapter['model'] = CoAdapter(w1 = 0, w2 = 1, w3 = 0).to(opt.device) 

    ckpt_path = getattr(opt, f'{cond_type.name}_adapter_ckpt', None)
    if ckpt_path is None:
        ckpt_path = getattr(opt, 'adapter_ckpt')
    state_dict = read_state_dict(ckpt_path)
    new_state_dict = {}
    for k, v in state_dict.items():
        if k.startswith('adapter.'):
            new_state_dict[k[len('adapter.'):]] = v
        else:
            new_state_dict[k] = v
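    # If some keys lack the 'depth_ada.' prefix, add it manually.
    # Iterate over a snapshot of the keys so the dict can be modified safely;
    # pop() avoids a KeyError for keys that were renamed in the loop above.
    for k in list(new_state_dict):
        if not k.startswith('depth_ada.'):
            new_state_dict['depth_ada.' + k] = new_state_dict.pop(k)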

    adapter['model'].load_state_dict(new_state_dict)

    return adapter
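
The key remapping in the function above can be isolated into a small helper and checked without loading any checkpoint. The following is a minimal sketch, not code from the HOIDiffusion repository: `remap_adapter_keys` is a hypothetical name, the key names in `example` are made up for illustration, and plain strings stand in for tensors. It assumes that `depth_ada.` is the prefix the `CoAdapter` module expects, as the snippet above suggests.

```python
def remap_adapter_keys(state_dict, strip_prefix="adapter.", add_prefix="depth_ada."):
    """Return a new dict whose keys all start with `add_prefix`.

    Hypothetical helper for illustration; values are passed through untouched.
    """
    remapped = {}
    for key, value in state_dict.items():
        # Drop a leading 'adapter.' if the checkpoint was saved with one.
        if key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        # Add 'depth_ada.' unless the key already carries it.
        if not key.startswith(add_prefix):
            key = add_prefix + key
        remapped[key] = value
    return remapped


# Made-up key names; a real checkpoint would map names to tensors.
example = {
    "adapter.body.0.weight": "w0",
    "conv_in.bias": "b0",
    "depth_ada.conv_in.weight": "w1",
}
remapped = remap_adapter_keys(example)
for name in sorted(remapped):
    print(name)
```

Passing a dictionary remapped this way to PyTorch's `load_state_dict` with its default `strict=True` raises an error if any key is still missing or unexpected, which is a quick way to confirm that the prefixes line up with the model.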

The generated images have the wrong number of fingers, and in some of them the hand does not touch the object:

[Images: ToyCar_0_0_9, ToyCar_0_0_6, ToyCar_0_2_6]

Mq-Zhang1 commented 3 days ago

From the command, it seems you are directly using the original SD 1.4 checkpoint and the depth-condition model from T2I-Adapter for generation. It is hard to get good results this way without any training.