djassoRaph / PythonHomework

Thanks! #1

Open coderschoolreview opened 6 years ago

coderschoolreview commented 6 years ago

The goal of this assignment was to introduce you to two main concepts in Machine Learning: data pre-processing and classification. You learned how to query and clean data using the pandas library in Python, and built a simple Machine Learning classifier based on the K-Nearest Neighbors algorithm.
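As a refresher on the idea behind K-Nearest Neighbors, here is a minimal sketch written with only the Python standard library. It is not the code from the assignment: the function name `knn_predict` and the toy data are made up for illustration, and the assignment itself most likely used a library classifier instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label appears.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, two classes.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(train_points, train_labels, query=(1.1, 1.0), k=3)
print(prediction)
```

The query point sits inside the first cluster, so all three of its nearest neighbours carry the same label and the vote is unanimous.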

Things you did well:

Things to work on:

Some tips:

coderschoolreview commented 6 years ago

Assignment 2

The goal of this assignment was to introduce you to three new classification techniques and to understand how to select the best parameters and features for them. You learned how to use scikit-learn utilities (GridSearchCV, SelectKBest, RFE, SelectFromModel) to try out new models (Support Vector Machines, Random Forests, and Logistic Regression), test different combinations of parameter values and features, and analyze your results to help build better machine learning models.
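For reference, a rough sketch of how feature selection and parameter search can be combined in scikit-learn. The data here is synthetic and the grid values are arbitrary choices for illustration; neither comes from the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; the assignment's dataset is not shown here).
X, y = make_classification(n_samples=200, n_features=10, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Chain feature selection and the classifier so both are tuned together
# and the selection step only ever sees the training folds.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", SVC()),
])

# Every combination of these values is evaluated with 5-fold cross-validation.
param_grid = {
    "select__k": [2, 4, 8],
    "clf__C": [0.1, 1, 10],
    "clf__kernel": ["linear", "rbf"],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```

Putting the selector inside the pipeline is a design choice: it keeps the number of selected features (`select__k`) in the same search as the model parameters, rather than picking features once up front.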

Unfortunately, your Python notebook failed to run. I got the following error when trying to run all the cells:

NameError: name 'df_audio_features_final' is not defined

It looks like a simple error that would be easy for me to fix, but given my workload, I don't have the bandwidth to fix errors in homework just to be able to grade it. Next time, please make sure that your notebook runs from top to bottom without errors so that I am able to evaluate it and provide you with feedback!

Overall, even though I know you are overwhelmed, this assignment does suggest that you have made some progress (I know you said you 'cheated' on this one, so I guess you are the best judge of how accurate my comment actually is :) )

Again, if you need help, we're always here for you, regardless of whatever form that help may take -- i.e. either pushing you to be better, or allowing you to be an observer in the class and re-take it next semester at no extra cost :) Please let me know by Friday whatever you decide.

Cheers!

coderschoolreview commented 6 years ago

Assignment 3

The goal of this assignment was to introduce you to three new Natural Language Processing techniques, and to understand how to perform some basic sentiment analysis on song lyrics using these methods. You learned how to clean and prepare textual information for NLP, and then apply the following approaches: Bag Of Words, TF-IDF, and Doc2Vec. You used your prior knowledge of Python estimators, feature selection, and parameter optimization techniques to produce feature vectors from these NLP methods to make predictions on the moods of songs using their lyrics.
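To make the first two approaches concrete, here is a small standard-library sketch of Bag of Words and TF-IDF. The three "lyrics" are invented, and the TF-IDF weighting used is the plain textbook form (term count times log(N / document frequency)); library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for cleaned song lyrics.
documents = [
    "happy sunny day",
    "sad rainy day",
    "happy happy song",
]

tokenized = [doc.split() for doc in documents]
# A fixed, sorted vocabulary gives every document a vector of the same length.
vocabulary = sorted({word for tokens in tokenized for word in tokens})


def bag_of_words(tokens):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


bow_vectors = [bag_of_words(tokens) for tokens in tokenized]

# Document frequency: in how many documents does each word appear?
n_docs = len(tokenized)
doc_freq = {word: sum(word in tokens for tokens in tokenized) for word in vocabulary}


def tf_idf(tokens):
    """Weight each raw count by log(N / df), so widespread words count for less."""
    counts = Counter(tokens)
    return [counts[word] * math.log(n_docs / doc_freq[word]) for word in vocabulary]


tfidf_vectors = [tf_idf(tokens) for tokens in tokenized]

print(vocabulary)
print(bow_vectors)
print(tfidf_vectors)
```

Either set of vectors can serve as the feature matrix for a classifier; the mood of each song is kept separately as the label.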

This is the most progress you've made on an assignment so far, so good job doing that!

You made one mistake which unfortunately caused the rest of your assignment to spiral out of control. It is a simple mistake and EASILY avoidable if you simply read/check your work carefully!

When creating your train_test_split, X needs to be your features and y needs to be your labels. So in this assignment, your features are your bag of words / tf-idf vectors, and your labels are your moods. However, you set your labels to be the lyrics...

What you did:

    X = bag_of_words
    y = df_happy_n_sad['lyrics_features']

But what it should have been:

    X = bag_of_words
    y = df_happy_n_sad['simplified_moods']

Read your work carefully next time. It's a shame because the above fix would have probably allowed you to complete the entire assignment without a problem!
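A minimal sketch of the corrected split, assuming scikit-learn's `CountVectorizer` and `train_test_split`. The `lyrics` and `moods` lists are invented stand-ins for the `lyrics_features` and `simplified_moods` columns of `df_happy_n_sad`, and the split settings are arbitrary.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the lyrics and mood columns of df_happy_n_sad.
lyrics = [
    "sunshine and smiles all day",
    "tears fall in the rain",
    "dancing with joy tonight",
    "lonely nights and broken hearts",
    "laughing in the summer sun",
    "crying alone in the dark",
    "singing a joyful song",
    "goodbye my love goodbye",
]
moods = ["happy", "sad", "happy", "sad", "happy", "sad", "happy", "sad"]

# Features: one bag-of-words row per song.
bag_of_words = CountVectorizer().fit_transform(lyrics)

X = bag_of_words  # features (the vectors built from the lyrics)
y = moods         # labels (the moods), NOT the lyrics themselves

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Quick sanity check: the labels should be a small set of mood names.
print(sorted(set(y_train)))
```

Printing the distinct values of `y` before training is a cheap way to catch this kind of mix-up: if it shows lyric text rather than a handful of mood names, the wrong column was used.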

Cheers