Open shenwuyue2022 opened 4 months ago
Hi, for our work we use the multi-modal labels provided by datasets built on top of CelebA, for example CelebA-Dialog, CelebAMask-HQ, and Multi-Modal-CelebA-HQ. If you wish to extract new multi-modal labels, you can use off-the-shelf extractors; for example, CelebAMask-HQ provides a face parsing network.
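To make the extraction step concrete, here is a minimal sketch of the pipeline shape: a parsing network produces a per-pixel class map, which is then downsampled with nearest-neighbour interpolation so the integer labels stay valid. The `parse_face` function below is a runnable stand-in, not the real CelebAMask-HQ parser; replace it with actual model inference.

```python
import numpy as np
from PIL import Image

def parse_face(image: Image.Image, num_classes: int = 19) -> np.ndarray:
    # Stand-in for a real face-parsing network: fabricates a per-pixel
    # class map so the pipeline is runnable. Swap in model inference here.
    h, w = image.size[1], image.size[0]
    rng = np.random.default_rng(0)
    return rng.integers(0, num_classes, size=(h, w), dtype=np.uint8)

def extract_mask_label(image: Image.Image, size: int = 32) -> np.ndarray:
    labels = parse_face(image)
    # Nearest-neighbour resize keeps the integer class indices intact;
    # bilinear/bicubic would blend neighbouring labels into invalid values
    labels_img = Image.fromarray(labels, mode="L")
    small = labels_img.resize((size, size), resample=Image.NEAREST)
    return np.array(small)

image = Image.new("RGB", (512, 512))
mask = extract_mask_label(image)
print(mask.shape)  # (32, 32)
```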
Hello! Regarding the mask and sketch parts of the CelebA dataset mentioned in this project: in the dataset the masks are images, but the program expects them converted into .pt files, and likewise for the sketches. Could you please provide the conversion code for this part? Also, regarding the folders under mask, given that the final .pt tensor has shape [19, 1024], does 19 correspond to the 19 categories, and does 1024 correspond to a downsampled 32×32 map?
> Also, regarding the folders under mask, given that the final .pt tensor has shape [19, 1024], does 19 correspond to the 19 categories, and does 1024 correspond to a downsampled 32×32 map?

Yes, that's right.
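The shape semantics confirmed above can be checked in a few lines: channel `c` of the `[19, 1024]` tensor is a flattened 32×32 binary map for class `c`, so reshaping to `[19, 32, 32]` and taking an argmax over the class axis recovers the per-pixel class map.

```python
import torch

num_classes, side = 19, 32
# Build a one-hot mask in the [19, 1024] layout from a random class map
class_map = torch.randint(0, num_classes, (side, side))
flat = torch.zeros(num_classes, side * side)
flat[class_map.view(-1), torch.arange(side * side)] = 1.0

# Recover the spatial layout and the class indices
spatial = flat.view(num_classes, side, side)   # [19, 32, 32]
recovered = spatial.argmax(dim=0)              # [32, 32]
print(torch.equal(recovered, class_map))  # True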
Hello, here is my understanding of the process for converting mask image files into .pt files; please check whether there are any issues:

```python
import os

import numpy as np
import torch
from PIL import Image


def resize_and_convert_to_tensor(file_path, output_dir, num_classes=19):
    # Load the mask as a single-channel image of integer class indices
    image = Image.open(file_path).convert('L')
    # Nearest-neighbour downsampling keeps the label values intact
    image = image.resize((32, 32), resample=Image.NEAREST)
    # Read the raw pixel values; transforms.ToTensor() would rescale the
    # class indices into [0, 1] and break int(pixel_value) below
    downsampled_map = np.array(image).reshape(-1)
    one_hot_tensor = torch.zeros((num_classes, 1024), dtype=torch.float32)
    for idx, pixel_value in enumerate(downsampled_map):
        class_index = int(pixel_value)
        if class_index < num_classes:
            one_hot_tensor[class_index, idx] = 1
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    tensor_file_path = os.path.join(output_dir, f"{base_name}.pt")
    torch.save(one_hot_tensor, tensor_file_path)
    return tensor_file_path
```
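One way to sanity-check a converted tensor is to verify the one-hot invariant: every pixel column should sum to exactly 1 (a column of zeros would mean a pixel whose label fell outside the 19 classes and was silently dropped). The snippet below builds a synthetic tensor for illustration; in practice you would pass the result of `torch.load` on one of your .pt files.

```python
import torch

def check_one_hot(one_hot: torch.Tensor, num_classes: int = 19, side: int = 32):
    # Shape must be [num_classes, side * side]
    assert one_hot.shape == (num_classes, side * side)
    # Every pixel must belong to exactly one class
    col_sums = one_hot.sum(dim=0)
    assert torch.all(col_sums == 1), "some pixels have no / multiple classes"
    # Return the recovered [side, side] class-index map
    return one_hot.view(num_classes, side, side).argmax(dim=0)

# Synthetic stand-in for a loaded .pt file (replace with torch.load(path))
labels = torch.randint(0, 19, (1024,))
one_hot = torch.zeros(19, 1024)
one_hot[labels, torch.arange(1024)] = 1.0

class_map = check_one_hot(one_hot)
print(class_map.shape)  # torch.Size([32, 32])
```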
How can I create the text, mask, and sketch annotations for my own images?