yu4u / age-gender-estimation

Keras implementation of a CNN network for age and gender estimation
MIT License

Age estimation as regression problem instead of classification #93

Closed aezco closed 5 years ago

aezco commented 5 years ago

I would like to treat age estimation as a regression problem instead of a classification problem.

The current code (classification):

model = Dense(1024, activation='relu')(model)
predictions = Dense(101, activation='softmax', name='pred_age')(model)
top_model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=[age_mae])

Is this the right way to change it (regression)?:

model = Dense(1024, activation='relu')(model)
predictions = Dense(1, activation='linear', name='pred_age')(model)
top_model.compile(loss='mse', optimizer=opt, metrics=[age_mae])

yu4u commented 5 years ago

It seems to work. You should also modify the data generator so that it generates regression targets (scalar ages) instead of one-hot vectors of target ages.
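As a rough illustration of that generator change, here is a minimal sketch. It assumes the existing generator yields `(images, one_hot_ages)` tuples with 101 age classes (0 to 100); the helper names `one_hot_to_age` and `regression_batches` are hypothetical and not part of this repository.

```python
import numpy as np


def one_hot_to_age(one_hot_batch):
    """Convert one-hot age vectors of shape (N, 101) to scalar ages of shape (N, 1)."""
    # The index of the hot entry is the age in years.
    ages = np.argmax(one_hot_batch, axis=-1).astype(np.float32)
    return ages.reshape(-1, 1)


def regression_batches(classification_generator):
    """Wrap a generator of (images, one_hot_ages) so it yields (images, ages).

    Assumption: the wrapped object is iterable and yields 2-tuples.
    """
    for images, one_hot_ages in classification_generator:
        yield images, one_hot_to_age(one_hot_ages)


if __name__ == "__main__":
    # Tiny fake batch: three 4x4 RGB images with ages 0, 25 and 100.
    fake_images = np.zeros((3, 4, 4, 3), dtype=np.float32)
    fake_labels = np.eye(101, dtype=np.float32)[[0, 25, 100]]
    for images, ages in regression_batches([(fake_images, fake_labels)]):
        print(images.shape, ages.ravel())
```

If the `age_mae` metric derives an expected age from a 101-way output (an assumption about the existing code), it would probably also need to compare the scalar prediction with the scalar target directly.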

aezco commented 5 years ago

@yu4u Yes, it works. I have another question about the results. How can I tell whether I have reached optimal results?

I trained on APPA-REAL and UTK for real age and obtained the following results, which are similar to yours:

[Screenshot 2019-04-30 at 22.44.35] [Screenshot 2019-04-30 at 17.15.44]

I also did transfer learning on the UTK dataset, and I can see that the model is overfitting. However, I do not fully understand what is causing this or how I could obtain better results. Transfer learning on the UTK dataset for images with age 50+:

[Screenshot 2019-05-01 at 11.24.23]

yu4u commented 5 years ago

The model also seems to have overfitted slightly to the APPA-REAL and UTK datasets. In any case, you can use general approaches for avoiding overfitting when training CNNs, such as data augmentation, model regularization (e.g. weight decay), reducing model complexity, and so on. I do not know of any approaches specific to age estimation.
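To make the data augmentation suggestion concrete, here is a minimal NumPy-only sketch of a random horizontal flip plus a small horizontal shift. The function name `augment_batch` and the `max_shift` parameter are hypothetical, and the wrap-around shift is a simplification chosen for brevity, not something taken from this repository.

```python
import numpy as np


def augment_batch(images, rng, max_shift=4):
    """Randomly flip and shift a batch of images with shape (N, H, W, C).

    Returns a new array; the input batch is left unchanged.
    """
    out = np.empty_like(images)
    for i, img in enumerate(images):
        if rng.random() < 0.5:
            img = img[:, ::-1, :]  # horizontal flip (mirror left/right)
        # Shift along the width axis; pixels wrap around the border.
        dx = int(rng.integers(-max_shift, max_shift + 1))
        out[i] = np.roll(img, dx, axis=1)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = np.random.default_rng(1).random((2, 64, 64, 3)).astype(np.float32)
    augmented = augment_batch(batch, rng)
    print(augmented.shape)
```

Age labels are unaffected by these transforms, so the targets can be passed through unchanged. For the weight decay suggestion, one option in Keras would be to pass `kernel_regularizer=regularizers.l2(...)` to the `Dense` layers, with a coefficient that has to be tuned on validation data.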

aezco commented 5 years ago

I used data augmentation and got the following graph:

[Screenshot 2019-05-01 at 22.55.26]

The model seems to fit better, although the MAE is higher than in the first results. How can the MAE be reduced? Is it better to remove the noisy data?

yu4u commented 5 years ago

The validation loss seems to have improved. How about using MAE as the loss instead of MSE?
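One reason MAE can help with noisy labels is that MSE squares each error, so a single badly mislabelled sample dominates the loss. The sketch below illustrates this with made-up ages (the numbers are illustrative only, not results from this thread).

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Made-up example: four predictions that are each off by 1 or 2 years.
y_true = np.array([20.0, 30.0, 40.0, 50.0])
y_pred = np.array([22.0, 28.0, 41.0, 49.0])

# Same data, but the last label is wrong (80 instead of 50).
y_noisy = y_true.copy()
y_noisy[3] = 80.0

print(mae(y_true, y_pred), mse(y_true, y_pred))    # 1.5 2.5
print(mae(y_noisy, y_pred), mse(y_noisy, y_pred))  # 9.0 242.5
```

Here the one noisy label raises MAE by a factor of 6 but MSE by a factor of 97, so gradients under MSE would be driven mostly by that sample. In Keras, switching would amount to passing `loss='mae'` to `compile`.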