liyiyue2021 closed this issue 1 year ago
Hi, @liyiyue2021! Thank you for your interest. To use your own dataset, you only need to generate a new mapping file. The file has the following format:
{
  "index 01": {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"},
  "index 02": {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"},
  "index 03": {"img_path": "the path to the image", "prompt": "the caption of the image", "pose_path": "the path to the pose npz file"},
  ...
}
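A minimal sketch of how such a mapping file could be generated, assuming you already have a list of (image path, caption, pose npz path) triples. The function names `build_mapping` and `save_mapping`, the zero-padded key format, and the output file name are illustrative assumptions, not part of the repository:

```python
import json


def build_mapping(samples):
    """Build the mapping dict from (img_path, prompt, pose_path) triples."""
    mapping = {}
    for i, (img_path, prompt, pose_path) in enumerate(samples, start=1):
        # Assumption: keys only need to be unique strings; zero-padded
        # indices are used here purely for readability.
        mapping[f"{i:08d}"] = {
            "img_path": str(img_path),
            "prompt": prompt,
            "pose_path": str(pose_path),
        }
    return mapping


def save_mapping(mapping, out_path):
    """Write the mapping dict to disk as JSON."""
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(mapping, f, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical sample paths, for illustration only.
    samples = [
        ("images/0001.jpg", "a person riding a bike", "poses/0001.npz"),
        ("images/0002.jpg", "two people dancing", "poses/0002.npz"),
    ]
    save_mapping(build_mapping(samples), "my_mapping_file.json")
```

The paths are stored exactly as given, so whether they should be absolute or relative to `base_path` depends on how you set `base_path` in the config.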
Then, add the data config to the YAML file. For example:
data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 4
    num_workers: 1
    wrap: False
    train:
      target: ldm.data.humansd.HumanSDPoseTrain
      params:
        image_size: 512
        map_file: the mapping file for training data
        base_path: the base path of images/poses' npz file in training mapping file
        max_person_num: 10
        keypoint_num: 17
        keypoint_dim: 3
        skeleton_width: 10
        keypoint_thresh: 0.1
        pose_skeleton: [
            [0,0,1],
            [1,0,2],
            [2,1,3],
            [3,2,4],
            [4,3,5],
            [5,4,6],
            [6,5,7],
            [7,6,8],
            [8,7,9],
            [9,8,10],
            [10,5,11],
            [11,6,12],
            [12,11,13],
            [13,12,14],
            [14,13,15],
            [15,14,16],
          ]
    validation:
      target: ldm.data.humansd.HumanSDPoseValidation
      params:
        image_size: 512
        map_file: the mapping file for validation data
        base_path: the base path of images/poses' npz file in validation mapping file
        max_person_num: 10
        keypoint_num: 17
        keypoint_dim: 3
        skeleton_width: 10
        keypoint_thresh: 0.1
        pose_skeleton: [
            [0,0,1],
            [1,0,2],
            [2,1,3],
            [3,2,4],
            [4,3,5],
            [5,4,6],
            [6,5,7],
            [7,6,8],
            [8,7,9],
            [9,8,10],
            [10,5,11],
            [11,6,12],
            [12,11,13],
            [13,12,14],
            [14,13,15],
            [15,14,16],
          ]
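Before starting training, it may help to sanity-check the mapping file against the `base_path` set in the config. The sketch below assumes that `img_path` and `pose_path` in the mapping file are relative to `base_path`. The helper `find_problems` and the file names in the usage section are hypothetical and not part of the repository:

```python
import json
import os

REQUIRED_KEYS = ("img_path", "prompt", "pose_path")


def find_problems(mapping, base_path):
    """Return a list of human-readable problems found in a mapping dict."""
    problems = []
    for key, entry in mapping.items():
        # Every entry should carry the three fields shown in the format above.
        for field in REQUIRED_KEYS:
            if field not in entry:
                problems.append(f"{key}: missing field '{field}'")
        # Assumption: file paths are resolved relative to base_path.
        for field in ("img_path", "pose_path"):
            rel = entry.get(field)
            if rel is not None and not os.path.exists(os.path.join(base_path, rel)):
                problems.append(f"{key}: {field} not found: {rel}")
    return problems


if __name__ == "__main__":
    # Hypothetical file and directory names, for illustration only.
    with open("my_mapping_file.json", encoding="utf-8") as f:
        mapping = json.load(f)
    for problem in find_problems(mapping, "path/to/base"):
        print(problem)
```

An empty result only means the files are present; it does not verify that each npz file holds keypoints in the layout the data loader expects.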
Hi, thanks for sharing the great work! Could you provide the steps for using this method to generate other types of objects (such as facial landmarks or hand poses)?
Hi, @asd33015! We are currently working on it. If you want to train your own model on another type of object, the only thing you need to do is replace the pose image with the target image (with facial landmarks or hand poses).
How can we use this method for other types of datasets, that is, other kinds of image generation (not human generation)? The method may need to be pre-trained on a new dataset.