yu4u / age-gender-estimation

Keras implementation of a CNN network for age and gender estimation
MIT License
1.46k stars, 502 forks

Age on dataset imdb and wiki #15

Open adamaulia opened 6 years ago

adamaulia commented 6 years ago

Hello, I found something strange in the dataset: the maximum age is 2014 and the minimum age is -31. I used your code from https://github.com/yu4u/age-gender-estimation/blob/master/create_db.py (line 34) and added a few lines:

mat_path = "../imdb.mat"
db = "imdb"
full_path, dob, gender, photo_taken, face_score, second_face_score, age\
    = get_meta(mat_path, db)

path_list = [str(item[0]) for item in full_path.tolist()]
max_age = np.max(age) #result 2014
max_age_idx = np.argmax(age) #result 181492
path_photo = path_list[max_age_idx] #result '64/nm1002664_rm1109196544_0-7-31_2015.jpg'

The dataset I downloaded is https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/static/imdb_crop.tar

yu4u commented 6 years ago

As described on the website, the dataset was created automatically by crawling. The ages in the dataset are estimates calculated from dob (date of birth) and photo_taken (Exif). Therefore, the resulting metadata may not be very accurate. The website itself says:

Of course, we can not vouch for the accuracy of the assigned age information. Besides wrong timestamps, many images are stills from movies - movies that can have extended production times
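
To illustrate how an estimate such as 2014 can come out of a dob/photo_taken calculation, here is a minimal sketch. It assumes that dob is stored as a MATLAB serial date number (366 days ahead of Python's proleptic ordinal) and that photo_taken is only a year. The name estimate_age, the mid-year convention, and the sample values are assumptions for illustration, not necessarily what the repository does.

```python
from datetime import date

# Assumption: dob is a MATLAB datenum, which is 366 days ahead of Python's ordinal.
MATLAB_TO_PYTHON_OFFSET = 366


def estimate_age(photo_taken_year, dob_datenum):
    """Estimate age in years from a photo year and a MATLAB-style birth date number."""
    # Clamp to ordinal 1 because Python dates cannot represent year 0 or earlier.
    ordinal = max(int(dob_datenum) - MATLAB_TO_PYTHON_OFFSET, 1)
    birth = date.fromordinal(ordinal)
    # Only the year of the photo is known, so assume it was taken mid-year.
    if birth.month < 7:
        return photo_taken_year - birth.year
    return photo_taken_year - birth.year - 1


# A plausible record: born 1980-03-15, photo taken in 2010.
plausible_dob = date(1980, 3, 15).toordinal() + MATLAB_TO_PYTHON_OFFSET
normal_age = estimate_age(2010, plausible_dob)

# A hypothetical broken record: a near-zero date number clamps to year 1.
broken_age = estimate_age(2015, 213)

print(normal_age, broken_age)
```

Under these assumptions, a crawled birth date close to year 0 is enough to produce an "age" in the thousands, and a birth date later than the photo year would produce a negative one.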

adamaulia commented 6 years ago

So if I want to train on images with exact ages, I have to filter them manually, right?

yu4u commented 6 years ago

Yes. Since it is impossible to know the exact ages, what we can do is filter out unreliable ages, either manually as you mentioned or automatically (though not perfectly) as done in create_db.py. You can use this dataset for pre-training and then fine-tune the model on a more reliable (but smaller) dataset, although I have not tried it: http://chalearnlap.cvc.uab.es/dataset/19/description/
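
An automatic filter of this kind could look roughly like the sketch below. It reuses the metadata names from the snippet earlier in the thread (age, face_score, second_face_score, gender). The function name reliable_mask, the threshold min_score, the 0 to 100 age range, and the toy values are assumptions for illustration and may differ from what create_db.py actually does.

```python
import numpy as np


def reliable_mask(age, face_score, second_face_score, gender, min_score=1.0):
    """Return a boolean mask that is True for records that pass simple sanity checks."""
    age = np.asarray(age, dtype=float)
    face_score = np.asarray(face_score, dtype=float)
    second_face_score = np.asarray(second_face_score, dtype=float)
    gender = np.asarray(gender, dtype=float)

    # A face was detected with enough confidence (missing detections may be -inf).
    has_face = face_score >= min_score
    # No second face: assume NaN (or a non-positive score) means "none found".
    single_face = np.isnan(second_face_score) | (second_face_score <= 0)
    # Drop impossible ages such as 2014 or -31.
    plausible_age = (age >= 0) & (age <= 100)
    # Drop records whose gender label is missing.
    known_gender = ~np.isnan(gender)
    return has_face & single_face & plausible_age & known_gender


# Toy metadata (made up for this example).
age = [2014, -31, 25, 40, 30, 33]
face_score = [3.0, 2.0, 4.0, -np.inf, 2.5, 3.0]
second_face_score = [np.nan, np.nan, np.nan, np.nan, 1.5, np.nan]
gender = [1, 0, 1, 0, 1, np.nan]

mask = reliable_mask(age, face_score, second_face_score, gender)
print(mask)
```

A filter like this only removes labels that are obviously wrong. An age that is plausible but incorrect (for example, a movie still released years after it was shot) still passes.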

adamaulia commented 6 years ago

For training age estimation, are you using categorical cross-entropy or regression?

yu4u commented 6 years ago

Cross entropy is used as proposed in the original paper:

        predictions_a = Dense(units=101, kernel_initializer=self._weight_init, use_bias=self._use_bias,
                              kernel_regularizer=l2(self._weight_decay), activation="softmax")(flatten)

edumucelli commented 6 years ago

Interesting discussion, but have you manually inspected the data? When I do, most of the photos make no sense for the age assigned to them. To me it seems impossible for the network to learn anything reliable from such data. Maybe I am missing something, but, for example, none of the IMDB images I see labeled as under 10 years old actually shows a person under 10.

yu4u commented 6 years ago

You are right. The labels in the IMDB-WIKI dataset are noisy because it was created automatically from websites. It was originally created for pre-training; the assumption is that you then train on a cleaner dataset such as the APPA-REAL Dataset (https://github.com/yu4u/age-gender-estimation/tree/master/appa-real).

edumucelli commented 6 years ago

I see. Good to know about the APPA-REAL dataset; I did not know about it. Thanks for the info!

NSavov commented 6 years ago

Thanks for the repo and all the effort! I understand that you get an MAE below 4 on other datasets, but I am curious what MAE you get for training/validation on the noisy IMDB-WIKI with this implementation. In my own implementation, trained on 14k filtered images from Wiki with a balanced distribution and balanced batches and no augmentation, VGG-16 with regression converges to 5.2 MAE (validation), AlexNet with regression (used for multi-task purposes with another network) to 5.6, and AlexNet with classification to 5.8. For my use, having the lowest score is not important, but it still seems alarmingly high, and I would like to compare if you have the numbers. Thanks!

yu4u commented 6 years ago

As I don't know how your implementation differs from this project, I can't say for certain, but: 1) the imdb dataset is relatively cleaner than the wiki dataset, and 2) classification + expectation (classify into age bins, then take the expected value of the predicted distribution) works better than direct regression if you are treating age estimation as a regression problem.
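
As a rough illustration of "classification + expectation", the sketch below takes a softmax output over 101 age bins (0 to 100) and compares the most likely bin with the expected value of the distribution. The probability values are made up for the example.

```python
import numpy as np

# One prediction over 101 age bins (0..100), as produced by a 101-unit softmax.
# Toy distribution: the model is torn between age 30 and age 40.
probs = np.zeros((1, 101))
probs[0, 30] = 0.6
probs[0, 40] = 0.4

ages = np.arange(101)

# Pure classification: pick the single most probable bin.
argmax_age = probs.argmax(axis=1)

# Classification + expectation: probability-weighted mean of the bin values.
expected_age = probs @ ages

print(argmax_age[0], expected_age[0])
```

The expectation yields a continuous estimate that reflects the uncertainty in the distribution, whereas the argmax can only return one of the 101 bin values.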

voqtuyen commented 5 years ago

Cross entropy is used as proposed in the original paper:

        predictions_a = Dense(units=101, kernel_initializer=self._weight_init, use_bias=self._use_bias,
                              kernel_regularizer=l2(self._weight_decay), activation="softmax")(flatten)

@yu4u Why don't you compute the loss between the label and the softmax expected value, as in demo.py:

results = model.predict(sub_test_images)
predicted_genders = results[0]
ages = np.arange(0, 101).reshape(101, 1)
predicted_ages = results[1].dot(ages).flatten()
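
For readers following along, the two quantities being compared can be sketched in NumPy as below. This is only an illustration with made-up probabilities, not the repository's training code, and the function names cross_entropy and expected_age_mae are hypothetical.

```python
import numpy as np


def cross_entropy(probs, true_ages, eps=1e-12):
    """Mean categorical cross-entropy, treating each integer age as a class index."""
    rows = np.arange(len(true_ages))
    picked = np.clip(probs[rows, true_ages], eps, 1.0)
    return float(np.mean(-np.log(picked)))


def expected_age_mae(probs, true_ages):
    """Mean absolute error between the label and the softmax expected value."""
    ages = np.arange(probs.shape[1])
    expected = probs @ ages
    return float(np.mean(np.abs(expected - true_ages)))


# Toy prediction for one image whose label is 30: mass spread over 29, 30 and 31.
probs = np.zeros((1, 101))
probs[0, 29] = 0.4
probs[0, 30] = 0.2
probs[0, 31] = 0.4
true_ages = np.array([30])

ce = cross_entropy(probs, true_ages)
mae = expected_age_mae(probs, true_ages)
print(ce, mae)
```

In this toy case the expected value matches the label exactly, so the expected-value error is zero, while the cross-entropy still penalizes the low probability assigned to the correct bin. The two losses therefore reward different things.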