IDEA-Research / HumanSD

[ICCV 2023] The official implementation of paper "HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation"
Apache License 2.0

How can we use this method for other types of datasets #7

Closed liyiyue2021 closed 1 year ago

liyiyue2021 commented 1 year ago

How can we use this method for other types of datasets? I mean other types of image generation (not human generation). This method may need to be pre-trained on a new dataset.

juxuan27 commented 1 year ago

Hi, @liyiyue2021! Thank you for your interest. To use your own dataset, you only need to generate a new mapping file. The file has the following format:

{"index 01": 
    {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"}, 
"index 02": 
    {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"}, 
"index 03": 
    {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"}, 
...}
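
A minimal sketch of how such a mapping file could be written from Python. The helper names (`build_mapping`, `write_mapping`) and the zero-padded key format are assumptions for illustration; the thread only shows placeholder keys such as "index 01", so use whatever keys your setup expects.

```python
import json


def build_mapping(records):
    """Build the mapping dict from (img_path, prompt, pose_path) tuples.

    The key format ("01", "02", ...) is an assumption, not confirmed
    by the thread.
    """
    mapping = {}
    for i, (img_path, prompt, pose_path) in enumerate(records, start=1):
        mapping[f"{i:02d}"] = {
            "img_path": str(img_path),
            "prompt": prompt,
            "pose_path": str(pose_path),
        }
    return mapping


def write_mapping(records, out_file):
    """Serialize the mapping to a JSON file."""
    with open(out_file, "w", encoding="utf-8") as f:
        json.dump(build_mapping(records), f, indent=2, ensure_ascii=False)
```

For example, `write_mapping([("images/a.jpg", "a caption", "poses/a.npz")], "train_mapping.json")` would write a one-entry file; the paths here are hypothetical and are presumably resolved relative to `base_path` in the config.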

Then, add the data config to the YAML file. For example:

data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 4
    num_workers: 1
    wrap: False
    train:
      target: ldm.data.humansd.HumanSDPoseTrain
      params:
        image_size: 512
        map_file: the mapping file for training data
        base_path: the base path of images/poses' npz file in training mapping file
        max_person_num: 10
        keypoint_num: 17
        keypoint_dim: 3
        skeleton_width: 10
        keypoint_thresh: 0.1
        pose_skeleton: [
                [0,0,1],
                [1,0,2],
                [2,1,3],
                [3,2,4],
                [4,3,5],
                [5,4,6],
                [6,5,7],
                [7,6,8],
                [8,7,9],
                [9,8,10],
                [10,5,11],
                [11,6,12],
                [12,11,13],
                [13,12,14],
                [14,13,15],
                [15,14,16],
            ]
    validation:
      target: ldm.data.humansd.HumanSDPoseValidation
      params:
        image_size: 512
        map_file: the mapping file for validation data
        base_path: the base path of images/poses' npz file in validation mapping file
        max_person_num: 10
        keypoint_num: 17
        keypoint_dim: 3
        skeleton_width: 10
        keypoint_thresh: 0.1
        pose_skeleton: [
                [0,0,1],
                [1,0,2],
                [2,1,3],
                [3,2,4],
                [4,3,5],
                [5,4,6],
                [6,5,7],
                [7,6,8],
                [8,7,9],
                [9,8,10],
                [10,5,11],
                [11,6,12],
                [12,11,13],
                [13,12,14],
                [14,13,15],
                [15,14,16],
            ]
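
Before training, it may help to sanity-check `pose_skeleton` against `keypoint_num`. The sketch below assumes each entry has the form `[limb_index, start_keypoint, end_keypoint]`, which is inferred from the values in the example config rather than stated in the thread; `check_skeleton` is a hypothetical helper, not part of the repository.

```python
def check_skeleton(pose_skeleton, keypoint_num):
    """Raise ValueError if the skeleton is inconsistent with keypoint_num.

    Assumes entries are [limb_index, start_keypoint, end_keypoint]
    (an inference from the example config, not a documented format).
    """
    for expected_index, entry in enumerate(pose_skeleton):
        if len(entry) != 3:
            raise ValueError(f"entry {entry} should have exactly 3 values")
        limb_index, start, end = entry
        if limb_index != expected_index:
            raise ValueError(f"entry {entry} should have limb index {expected_index}")
        for kp in (start, end):
            if not 0 <= kp < keypoint_num:
                raise ValueError(f"keypoint {kp} in {entry} is outside 0..{keypoint_num - 1}")
    return True


# The skeleton copied from the example config above.
SKELETON_FROM_CONFIG = [
    [0, 0, 1], [1, 0, 2], [2, 1, 3], [3, 2, 4],
    [4, 3, 5], [5, 4, 6], [6, 5, 7], [7, 6, 8],
    [8, 7, 9], [9, 8, 10], [10, 5, 11], [11, 6, 12],
    [12, 11, 13], [13, 12, 14], [14, 13, 15], [15, 14, 16],
]

check_skeleton(SKELETON_FROM_CONFIG, keypoint_num=17)
```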

asd33015 commented 1 year ago

Hi, thanks for sharing the great work! Could you provide the steps for using this method to generate other types of objects (such as facial landmarks or hand poses)?

juxuan27 commented 1 year ago

Hi, @asd33015! We are currently working on it. If you want to train your own model on another type of object, all you need to do is change the pose image to the target image (with facial landmarks or a hand pose).
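
As a rough sketch of what a different keypoint layout might look like in the config, the snippet below builds a `pose_skeleton` list for a hypothetical 21-keypoint hand (wrist at index 0, four keypoints per finger). The layout, the `[limb_index, start, end]` entry format, and the helper names are all assumptions for illustration; the thread does not confirm that the data classes accept other values of `keypoint_num`.

```python
def hand_edges():
    """Edges for a hypothetical 21-keypoint hand: wrist 0, then 4 joints per finger."""
    edges = []
    for base in (1, 5, 9, 13, 17):
        edges.append((0, base))  # wrist to the first joint of the finger
        for k in range(base, base + 3):
            edges.append((k, k + 1))  # consecutive joints along the finger
    return edges


def make_pose_skeleton(edges):
    """Turn (start, end) pairs into [limb_index, start, end] entries."""
    return [[i, a, b] for i, (a, b) in enumerate(edges)]


skeleton = make_pose_skeleton(hand_edges())
keypoint_num = 1 + max(max(a, b) for _, a, b in skeleton)
```

The resulting list could then be pasted into `pose_skeleton`, with `keypoint_num` set to match, assuming the pose npz files store keypoints in the same order.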