This project is ambitious and uses data we haven't really covered in this course. I think the project parameters would have benefited from an explanation of why being able to differentiate male and female voices is important. It sounds like a necessary step towards voice recognition, but more justification would have helped clarify the difficulties of the project.
The overview of the data set was concise and made key observations on quantity and quality that reflected the relative comprehensiveness of the data. Good job validating that the data wasn't corrupt or missing entries. The explanation of the features was clear and much needed.
The section on exploratory data analysis is interesting. Your method of classifying by median to identify important features seems quite novel, and is a good initial step towards better understanding the data.
Following that, you fit a lot of different models to the data, and it was great to see how PCA helped dramatically increase the success rate.
As someone else mentioned, the ratio of male to female samples seems important, especially considering the small sample size. Additionally, it would have been interesting if you had considered age and other indicators like ethnicity and language spoken. Young boys have higher-pitched voices that might be indistinguishable from girls' voices.
Overall, good job on taking on a hard project and applying concepts from inside and outside of the scope of this class!