How to create custom data for training?

kdhht2334 / ELIM_FER

[NeurIPS 2022] The official repository of Expression Learning with Identity Matching for Facial Expression Recognition

MIT License


How to create custom data for training? #1

Closed Doanhdz closed 1 year ago

Doanhdz commented 1 year ago

First, thanks for your great work. I have a question about dataset annotation and need your help. I have 10,000 images without labels. How should I label my dataset so that I can use it with your model? I hope you can explain it to me.

kdhht2334 commented 1 year ago

Basically, our FER model operates in the valence-arousal (VA) space.

So each image needs a ground-truth (GT) annotation for both valence and arousal.

For example, a training.csv file looks like this:

filePath	valence	arousal
1/001/1.png	0.2	-0.1
...	...	...

Here, the filePath value can be set arbitrarily by the user (e.g., 001/1.png).

For more details, please refer to this website :) https://ibug.doc.ic.ac.uk/resources/afew-va-database/
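As a rough illustration of the file layout described above, here is a minimal Python sketch that writes and re-reads such a table. The column names come from the example; the comma delimiter, the three-decimal formatting, the [-1, 1] value range, and the helper names `write_annotations` and `read_annotations` are assumptions made for this sketch, not something the repository confirms.

```python
import csv
import io

# Column names follow the example above. The delimiter and the [-1, 1]
# range are assumptions; adjust them to whatever your loader expects.
FIELDS = ["filePath", "valence", "arousal"]


def write_annotations(rows, handle, delimiter=","):
    """Write (path, valence, arousal) tuples as a training.csv-style table."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(FIELDS)
    for path, valence, arousal in rows:
        writer.writerow([path, f"{valence:.3f}", f"{arousal:.3f}"])


def read_annotations(handle, delimiter=",", low=-1.0, high=1.0):
    """Read the table back, rejecting values outside [low, high]."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    if reader.fieldnames != FIELDS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for line_no, record in enumerate(reader, start=2):
        valence = float(record["valence"])
        arousal = float(record["arousal"])
        for name, value in (("valence", valence), ("arousal", arousal)):
            if not low <= value <= high:
                raise ValueError(
                    f"line {line_no}: {name}={value} outside [{low}, {high}]"
                )
        rows.append((record["filePath"], valence, arousal))
    return rows


if __name__ == "__main__":
    buffer = io.StringIO()
    write_annotations([("001/1.png", 0.2, -0.1), ("001/2.png", -0.45, 0.6)], buffer)
    buffer.seek(0)
    print(read_annotations(buffer))
```

Reading the file back right after writing it is a cheap way to catch a wrong header or an out-of-range value before training starts.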

Doanhdz commented 1 year ago

Sorry, but what I wanted you to explain is how an image comes to be annotated with 0.2 (valence) and -0.1 (arousal). What is that based on? I have a lot of unlabeled data, but I don't know how to annotate the valence and arousal values.

kdhht2334 commented 1 year ago

First of all, it is difficult to annotate valence and arousal values directly from a given image. Unlike common problems such as classification tasks, emotion annotation is not easy and involves some qualitative subjectivity.

There are roughly two options.

1) Self-annotation: Have the subject of each image annotate how they felt at that moment. The meaning of the valence and arousal axes must be explained to them in advance.

2) Hire skilled annotators.

The criteria for annotation in VA space can be found in the paper below:

J. Posner et al. The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology, 2005.

Doanhdz commented 1 year ago

Thanks for your answer, I will try it.

kdhht2334 commented 1 year ago

Sorry for the late reply :)

Sure!! But be careful about the following:

When annotating valence-arousal (VA) values, if the proportion of outlier samples is large, the model may not train properly.
Here, an outlier sample is one whose VA values are much larger in absolute value than those of the other samples.
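One way to check for this before training is sketched below in Python. It flags samples whose largest absolute VA value sits far above the typical magnitude in the dataset. The rule used here (median plus `k` times the median absolute deviation), the default `k=3.0`, and the function names are assumptions chosen for illustration; the thread does not specify a threshold.

```python
from statistics import median


def flag_outliers(samples, k=3.0):
    """Return indices of (valence, arousal) pairs with unusually large magnitude.

    A sample is flagged when max(|valence|, |arousal|) exceeds
    median + k * MAD of that quantity over the whole dataset.
    If the MAD is zero, every sample above the median is flagged.
    """
    magnitudes = [max(abs(v), abs(a)) for v, a in samples]
    centre = median(magnitudes)
    spread = median([abs(m - centre) for m in magnitudes])  # median absolute deviation
    limit = centre + k * spread
    return [i for i, m in enumerate(magnitudes) if m > limit]


def outlier_ratio(samples, k=3.0):
    """Fraction of samples flagged by flag_outliers."""
    if not samples:
        return 0.0
    return len(flag_outliers(samples, k)) / len(samples)


if __name__ == "__main__":
    data = [(0.1, 0.2), (0.2, -0.1), (-0.15, 0.1), (0.1, 0.1), (0.95, -0.9)]
    print(flag_outliers(data), outlier_ratio(data))
```

A high ratio suggests the flagged annotations deserve a second look. What counts as "too high" depends on your data.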

Fingers crossed!

Doanhdz commented 1 year ago

So what you mean is that I need to collect a balanced dataset that covers the whole VA space?

kdhht2334 commented 1 year ago

Yes.

The best way is to collect the dataset with a distribution similar to Figure 4 in the paper below:

https://arxiv.org/pdf/1708.03985.pdf
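To get a feel for how your own annotations spread over the VA plane, a coarse 2D count is one option. The Python sketch below bins samples into a grid and lists the cells that received none. The [-1, 1] range, the grid size, and the function names are assumptions for illustration only.

```python
from collections import Counter


def va_histogram(samples, bins=4, low=-1.0, high=1.0):
    """Count (valence, arousal) samples per cell of a bins x bins grid."""
    width = (high - low) / bins

    def cell(value):
        index = int((value - low) / width)
        # Clamp so that value == high (or anything out of range) stays in the grid.
        return min(max(index, 0), bins - 1)

    return Counter((cell(v), cell(a)) for v, a in samples)


def empty_cells(counts, bins=4):
    """List grid cells (valence_bin, arousal_bin) that contain no samples."""
    return [(i, j) for i in range(bins) for j in range(bins) if counts[(i, j)] == 0]


if __name__ == "__main__":
    data = [(-0.9, -0.9), (0.9, 0.9), (1.0, 1.0), (0.1, -0.1)]
    counts = va_histogram(data, bins=2)
    print(dict(counts))
    print("empty cells:", empty_cells(counts, bins=2))
```

Empty or very sparse cells point to regions of the VA space that the dataset does not cover. Some combinations may be rare in real data, so an empty cell is not always a defect.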

Doanhdz commented 1 year ago

Thanks for your support, I got it.