I need to develop a plan for completing my final project for my ENTITY Academy data science certification. The plan should be as thorough as possible and cover every step: preparing to create, creating, and presenting the project. This issue is a living document, and I expect to update it as the project evolves.
[ ] Objective
~ To create an email classifier (what is a more general term for this?) that separates valuable emails from unwanted emails, commonly referred to as a "spam filter."
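~ Preprocessing will be done using NLP methods. Normalizing the text shrinks the vocabulary the model has to learn, which should speed up training and may improve accuracy.
~ Model will be built using a statistical algorithm based on Bayes' theorem (e.g., a Naive Bayes classifier).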
~ All work will be compiled within a GitHub repository and summarized in a video presentation.
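
As a rough illustration of the Bayes-based approach named in the objective, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with only the Python standard library. The function names (`train_naive_bayes`, `predict`), the "spam"/"ham" labels, and the four toy training emails are all hypothetical and exist only for this example. The actual project may well use a library implementation and a real dataset instead.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class from tokenized emails."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: share of training emails that belong to this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                # Add-one smoothing keeps unseen word/class pairs above zero.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, made up for illustration only.
train_docs = [
    "win free prize now".split(),
    "free money click now".split(),
    "meeting agenda for monday".split(),
    "project report attached for review".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "free prize money".split()))
print(predict(model, "monday project meeting".split()))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, which matters once emails get longer than a few words.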
[ ] Resources
~ Code, datasets, tutorials, and related info publicly available on Kaggle, GitHub, and KDnuggets.
~ ENTITY Academy (EA)/Woz-U data science curriculum
~ EA instructor, Dr. Mo
~ Mentors and students on the EA Slack channels
~ Videos on the EA/Woz-U Vimeo channel
[ ] Software and applications I need at least a basic working knowledge of
~ Python
~ Microsoft Office
~ Tableau?
~ Vimeo?
[ ] Analyses needed
~ Preprocessing techniques: lowercasing, stemming, noise removal
~ Modeling - more research needed to figure out the best approaches
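
The preprocessing techniques listed above (lowercasing, noise removal, stemming) could be sketched roughly as follows. This is only an illustration under some assumptions: `preprocess`, `crude_stem`, and `SUFFIXES` are hypothetical names, "noise" is assumed to mean HTML tags, URLs, digits, and punctuation, and the suffix-stripping stemmer is a deliberately crude stand-in for a real one such as NLTK's `PorterStemmer`.

```python
import re

# Hypothetical suffix list for a toy stemmer; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, remove noise, tokenize on whitespace, and stem each token."""
    text = text.lower()                        # lowercasing
    text = re.sub(r"<[^>]+>", " ", text)       # noise removal: HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # noise removal: URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # noise removal: digits, punctuation
    return [crude_stem(tok) for tok in text.split()]


print(preprocess("<b>WINNERS</b> claimed 3 prizes at http://example.com!!!"))
```

Stems do not have to be dictionary words; the point is only that variants such as "prize" and "prizes" end up mapped closer together before the model counts them.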
[ ] Hardware and accessories needed for project and presentation
~ Laptop with webcam and mic
~ Laptop speakers are poor. Need to confirm that the mic is not also substandard.
~ Need a nice background for the video, either physical or digital