Question about building the dataset

kr150 commented 2 months ago

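In the README you describe "each line with 4 key-value pairs, including original id, edited id, augmented id, face structural image of edited id". My initial reading is that these are the original image, the image after text inversion, an augmented version of the original image, and the facial structure map of the text-inverted image.

In the code, however, the data is split into id_pixel_values, warp_pixel_values, makeup_pixel_values, and pose_pixel_values. I see that warp_pixel_values is used in the makeup_encoder step. From the paper, I assumed warp_pixel_values was the text-inverted image after augmentation. But preprocess_train(examples) already applies data augmentation to id_pixel_values, pose_pixel_values, and makeup_pixel_values, so how is warp_pixel_values produced?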

lyggyhmm commented 1 month ago

@kr150 Has your training finished? I'd like to see a sample of your dataset.

kr150 commented 1 month ago

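My training is not very polished. My dataset has three parts: the original images, the images edited with LEDITS, and the facial line drawings extracted from the edited images. During training, both makeup_pixel_values and warp_pixel_values use the LEDITS-edited images.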

Xiaojiu-z commented 2 weeks ago

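I used this library to add random perturbations to each facial keypoint, such as the nose, eyes, and face. I left the source code at my previous company, so it is not public, but the implementation is very simple and GPT can help you write it.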
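The library is not named in this thread and the original code is not public, so the following is only a minimal sketch of the general idea described above: jitter each facial keypoint by a small random offset, then warp the image so that content follows the moved keypoints. The names `jitter_keypoints`, `warp_image`, and `max_shift` are hypothetical, the keypoints are assumed to come from any face landmark detector, and the inverse-distance-weighted warp is a simple stand-in rather than the author's actual method.

```python
import numpy as np


def jitter_keypoints(keypoints, max_shift, rng):
    """Add a uniform random offset in [-max_shift, max_shift] to each (x, y) keypoint."""
    keypoints = np.asarray(keypoints, dtype=np.float64)
    offsets = rng.uniform(-max_shift, max_shift, size=keypoints.shape)
    return keypoints + offsets


def warp_image(image, src_pts, dst_pts, power=2.0, eps=1e-6):
    """Backward-warp `image` so that content at src_pts moves toward dst_pts.

    Keypoint displacements are spread over the whole image with
    inverse-distance weighting, and pixels are fetched with
    nearest-neighbour sampling. Memory grows with height * width * keypoints,
    so this sketch is meant for small images or few keypoints.
    """
    h, w = image.shape[:2]
    src_pts = np.asarray(src_pts, dtype=np.float64)
    dst_pts = np.asarray(dst_pts, dtype=np.float64)

    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([xs, ys], axis=-1).astype(np.float64)  # (h, w, 2) as (x, y)

    # Distance from every output pixel to every destination keypoint.
    diff = grid[:, :, None, :] - dst_pts[None, None, :, :]  # (h, w, k, 2)
    dist = np.linalg.norm(diff, axis=-1)                    # (h, w, k)
    weights = 1.0 / (dist ** power + eps)
    weights /= weights.sum(axis=-1, keepdims=True)

    # An output pixel near dst_pts[i] should sample from near src_pts[i].
    shift = (weights[..., None] * (src_pts - dst_pts)[None, None]).sum(axis=2)
    sample = np.rint(grid + shift).astype(np.int64)
    sx = np.clip(sample[..., 0], 0, w - 1)
    sy = np.clip(sample[..., 1], 0, h - 1)
    return image[sy, sx]


# Placeholder data: a random image and made-up landmark positions.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
keypoints = np.array([[20.0, 24.0], [44.0, 24.0], [32.0, 36.0], [32.0, 48.0]])
warped = warp_image(image, keypoints, jitter_keypoints(keypoints, 3.0, rng))
```

A smoother warp (for example thin-plate splines or piecewise affine triangles) with bilinear sampling would likely give cleaner results; the version above only keeps the dependencies to NumPy.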

kr150 commented 2 weeks ago

I see the code is set to 1000 epochs, which is alarming: even on 8 V100s that would take hundreds of days, wouldn't it? On a single A100, 30 epochs takes me 96 hours.
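As a rough check of that estimate, the measured figure (30 epochs in 96 hours on one A100) can be extrapolated linearly. The helper name `projected_days` is hypothetical, and the sketch assumes a constant time per epoch on the same hardware; it says nothing about how 8 V100s would scale.

```python
def projected_days(total_epochs, measured_hours, measured_epochs):
    """Linearly extrapolate wall-clock days from a measured run."""
    hours_per_epoch = measured_hours / measured_epochs
    return total_epochs * hours_per_epoch / 24.0


# 96 h / 30 epochs = 3.2 h per epoch, so 1000 epochs is 3200 h.
days = projected_days(total_epochs=1000, measured_hours=96, measured_epochs=30)
print(f"{days:.1f} days")  # about 133.3 days on the single A100
```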

Xiaojiu-z commented 2 weeks ago

Haha, you can stop training early. Results should be visible at roughly 50k to 100k steps.
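Stopping by step count rather than by epoch can be sketched as a step budget inside the training loop. This is a generic illustration, not the repository's training script: `train`, `train_step`, and `max_train_steps` are hypothetical names, and the numbers in the usage lines are placeholders.

```python
def train(num_epochs, batches_per_epoch, max_train_steps, train_step):
    """Run the loop until either all epochs finish or the step budget is hit.

    Returns (steps_done, last_epoch_index).
    """
    global_step = 0
    for epoch in range(num_epochs):
        for batch_idx in range(batches_per_epoch):
            train_step(epoch, batch_idx)  # one optimizer update
            global_step += 1
            if global_step >= max_train_steps:
                return global_step, epoch  # stop early, well before num_epochs
    return global_step, num_epochs - 1


# With a 1000-epoch setting, a 50k-step budget ends the run long before the last epoch.
steps_done, last_epoch = train(
    num_epochs=1000,
    batches_per_epoch=500,      # placeholder dataset size / batch size
    max_train_steps=50_000,
    train_step=lambda epoch, batch_idx: None,  # stand-in for the real update
)
```

Saving checkpoints at fixed step intervals alongside this makes it possible to compare results anywhere in the 50k to 100k range without rerunning.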

Xiaojiu-z / Stable-Makeup

Question about building the dataset #24