After studying the training code in a bit more detail, am I correct in understanding the following? Each line in the JSON file represents a single data sample, and each data sample should be a JSON object containing these fields:

- `file_name`: the path to the image file, relative to `--data_root_path`
- `additional_feature`: the text description or additional features associated with the image
- `bbox`: the bounding-box coordinates of the face in the image
- `landmarks`: the facial landmark coordinates
- `insightface_feature_file`: the path to the corresponding InsightFace feature file, relative to `--data_root_path`

Is that correct?
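For what it's worth, here is a minimal sketch of how one such line could be assembled, assuming the field names above; the exact keys, value formats, and paths are my guesses from this thread, not confirmed against the repo:

```python
import json

# Hypothetical example of a single training sample (one line of the JSONL file).
# All paths are relative to --data_root_path; key names and formats are assumed.
sample = {
    "file_name": "images/000123.jpg",
    "additional_feature": "a portrait photo of a smiling woman with short hair",
    "bbox": [112.0, 84.0, 356.0, 402.0],            # [x1, y1, x2, y2] of the face
    "landmarks": [[180.5, 210.3], [260.1, 208.7],   # e.g. 5-point landmarks:
                  [221.4, 265.0], [190.2, 320.8],   # eyes, nose, mouth corners
                  [255.9, 318.4]],
    "insightface_feature_file": "features/000123.npy",
}

# Append one JSON object per line to the training manifest.
with open("train_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(sample) + "\n")
```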
Our private dataset consists of portrait data, which may include single or multiple individuals. During data processing, landmark detection is performed only on the largest face region. The captions are then generated with BLIP, and the JSON files are processed exactly as you described.
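In case it helps others, a rough sketch of that preprocessing could look like the following, assuming InsightFace for detection/landmarks/embeddings and the Hugging Face BLIP captioning checkpoint; the model names, feature-file format, and output keys are illustrative assumptions, not taken from the repo:

```python
import numpy as np
from PIL import Image
from insightface.app import FaceAnalysis
from transformers import BlipProcessor, BlipForConditionalGeneration

# Assumed components: InsightFace for face detection, landmarks, and identity
# embeddings; BLIP (base captioning checkpoint) for the text description.
app = FaceAnalysis(name="buffalo_l")
app.prepare(ctx_id=0, det_size=(640, 640))
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def process_image(image_path, feature_path):
    image = Image.open(image_path).convert("RGB")
    faces = app.get(np.array(image)[:, :, ::-1])  # InsightFace expects BGR input
    if not faces:
        return None  # skip images without a detectable face

    # Keep only the largest detected face, as described above.
    def area(f):
        x1, y1, x2, y2 = f.bbox
        return (x2 - x1) * (y2 - y1)
    face = max(faces, key=area)

    # Caption the full image with BLIP.
    inputs = processor(images=image, return_tensors="pt")
    out = blip.generate(**inputs, max_new_tokens=40)
    caption = processor.decode(out[0], skip_special_tokens=True)

    # Save the identity embedding; the on-disk format here is an assumption.
    np.save(feature_path, face.normed_embedding)

    return {
        "file_name": image_path,
        "additional_feature": caption,
        "bbox": face.bbox.tolist(),
        "landmarks": face.kps.tolist(),
        "insightface_feature_file": feature_path,
    }
```

Each returned dict can then be written out one-per-line as in the JSONL snippet above.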
Thank you @chenxinhua, I will run a quick training job on LAION-Face.
@jadechip Hello, may I ask how to create the JSON file? Given a dataset, how should I process it to start training? I have seen that BLIP is used to create the text descriptions; what should I do next? Thank you very much!
Hi and thank you for your amazing contribution!
I understand you have trained the model on a private dataset, but I am wondering if you could provide some details on the format of that dataset. For example, should I use cropped face images like LAION-Face with corresponding facial landmark annotations? And roughly how many images are required to obtain good results?
Thank you 🙏