princetonvisualai / revise-tool

REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets --- https://arxiv.org/abs/2004.07999
MIT License

Running on emotion recognition dataset with multiple people in one image #13

Closed · aitrain closed this issue 3 years ago

aitrain commented 3 years ago

Hello, I have a short question about using the tool with a custom dataset. I have followed the instructions for adapting TemplateDataset to my own dataset. The dataset I am working with contains an emotion label for each person in an image; since multiple people can appear in one image, there can be multiple emotion labels per image. How should this case be treated? So far, I have tried this:


    def from_path(self, file_path):

        # ...... (image loading etc. omitted; image, annotation_path and image_id
        # are set up in this part)

        # load the per-image annotation .mat file (loadmat is scipy.io.loadmat)
        annotation_mat_file_name = annotation_path + image_id.replace('.jpg', '.mat')
        mat_file = loadmat(annotation_mat_file_name)

        people = mat_file['people']

        gender_info = []
        image_anns = []
        # iterate over every annotated person in the image
        for p in range(len(people)):
            person_bbox = people['person'][p]['bbx']
            gender = people['person'][p]['gender']
            new_ann = {'bbox': person_bbox, 'label': people['person'][p]['cats']}
            image_anns.append(new_ann)
            gender_info.append([gender, self.biggest_bbox])
        country = None

        scene_group = self.scene_mapping[(self.img_folder + file_path)]  # optional
        anns = [image_anns, gender_info, [country], (self.img_folder + file_path), scene_group]

        return image, anns

I saved the labels for each image in a .mat file. people is a struct with one entry per person, each containing that person's bounding box, gender, labels, and other fields. The problem is the following: the lists inside 1.pkl (and, my guess is, also those for the other measurements) are empty, presumably because the annotations I pass in from_path are incorrect.
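In case it is relevant, this is roughly how the struct can be inspected after loading (a minimal sketch; the file name people_01.mat is a placeholder and the field names person, bbx, gender, cats follow my annotation format):

    # Minimal sketch for inspecting one annotation .mat file; "people_01.mat"
    # is a placeholder and the field names (person, bbx, gender, cats) follow
    # the annotation format described above.
    from scipy.io import loadmat

    # squeeze_me=True and struct_as_record=False turn MATLAB structs into
    # objects with attribute access, which is easier to check than nested
    # record arrays
    mat_file = loadmat('people_01.mat', squeeze_me=True, struct_as_record=False)
    people = mat_file['people']

    persons = people.person
    try:
        persons = list(persons)   # struct array with several people
    except TypeError:
        persons = [persons]       # a single person collapses to a scalar struct
    for person in persons:
        print(person.bbx, person.gender, person.cats)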

Do you have any suggestions on how to proceed? Should I save one .mat file per annotated person and thus treat each person and its corresponding bounding box as a separate image?

I would be very grateful for your input on this matter, and I am looking forward to your reply.

Best regards, Nastassia

Angelina-Wang commented 3 years ago

Hi Nastassia,

As of now, our tool does not allow for more than one person per image, so gender_info is expected to be a single tuple of the form (gender, bbox). I think there are two options for how to deal with your case: either treat each annotated person and their bounding box as a separate data point, as you suggested, or keep a single person per image, for example the one with the largest bounding box.
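For illustration, here is a minimal sketch of one way to keep a single person per image (the one with the largest bounding box), so that gender_info stays a single (gender, bbox) tuple. It reuses the people['person'] indexing and field names (bbx, gender, cats) from your snippet and assumes boxes are stored as [x0, y0, x1, y1]; adjust to your actual format:

    # Minimal sketch: keep only the person with the largest bounding box so
    # that gender_info becomes a single (gender, bbox) tuple. The field names
    # and the [x0, y0, x1, y1] box format are assumptions based on the
    # snippet above.
    def bbox_area(bbox):
        x0, y0, x1, y1 = bbox
        return max(0, x1 - x0) * max(0, y1 - y0)

    persons = [people['person'][p] for p in range(len(people))]
    biggest = max(persons, key=lambda person: bbox_area(person['bbx']))

    gender_info = (biggest['gender'], biggest['bbx'])
    image_anns = [{'bbox': biggest['bbx'], 'label': biggest['cats']}]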

Also, image_anns typically refers to the non-person annotations in a dataset (e.g., tennis ball, table), but I see you are using it to document human annotations. That should also be okay, but I just wanted to give you a heads-up.
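Just so we are on the same page about the format, image_anns for non-person objects would typically look something like this (the labels and coordinates here are made up, using the same dict format as in your snippet):

    # Made-up example of image_anns holding non-person annotations
    image_anns = [
        {'bbox': [34, 120, 88, 176], 'label': 'tennis ball'},
        {'bbox': [210, 40, 460, 300], 'label': 'table'},
    ]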

Hope this helps :)