This is a very interesting and advanced topic. As you said, if someone could expand the project to predict not only gender but also other attributes such as age, hometown, and occupation based only on voice, it would be a pretty sophisticated artificial intelligence problem.
Understanding the data set and generating hypotheses require some basic physics knowledge about the human voice and sound, which I think is a difficult part of this project. But it seems like you have already made some progress on cleaning the data set.
The graphics you generated from the data look great. From those graphs, I have a better sense of how male and female voices differ. You also found “meanfun” and “Q25” to be important by looking at those graphics.
I personally have limited knowledge of human voice recognition, so I think this project is very demanding. Both the model fitting part and the model interpretation part are difficult. If I were doing this project, I would probably need to spend a long time trying different feature engineering techniques to find a good model, and an equal amount of time trying to produce a good interpretation of the selected model.
I am really looking forward to seeing the final result of the project. Keep up the good work!