diegocasmo / iis-project

0 stars 0 forks source link

KNN Classifier #12

Closed ankurshukla03 closed 6 years ago

ankurshukla03 commented 6 years ago
tristaaan commented 6 years ago

Great job! Deleting the omnibus lm3-classifier file might have been jumping the gun a bit (would do that last so you can keep referencing it) Looks like creating new classifier files from here out will be much easier!

tristaaan commented 6 years ago

Edited last comment, didn't see that it was added back. LGTM

diegocasmo commented 6 years ago

@ankurshukla03 @tristaaan

Is get_train_test_split repeated in all notebooks? Could be useful to place it in just one place, instead of re-declaring it in each notebook

diegocasmo commented 6 years ago

@ankurshukla03 @tristaaan

As a matter of fact, it might be even cooler if instead of returning (train_df, test_df) from get_train_test_split, return instead (train_data, train_labels, test_data, test_labels). This way we don't have to unpack those values every time (this logic also seems to be repeated across all notebooks)

ankurshukla03 commented 6 years ago

yeah i decided to add that file back if in by any chance we need that.

Great job! Deleting the omnibus lm3-classifier file might have been jumping the gun a bit (would do that last so you can keep referencing it) Looks like creating new classifier files from here out will be much easier!

ankurshukla03 commented 6 years ago

@diegocasmo you are right get_train_test_split is used in every notebook with the classifier. It's better to make a separate file like preprocess.py so we don't have to make a same cell in every notebook. Returning this (train_data, train_labels, test_data, test_labels) will be much better. I will try to do all that in a python file and use that in a notebook.

As a matter of fact, it might be even cooler if instead of returning (train_df, test_df) from get_train_test_split, return instead (train_data, train_labels, test_data, test_labels). This way we don't have to unpack those values every time (this logic also seems to be repeated across all notebooks)

tristaaan commented 6 years ago

What is preprocess_2 and does it replace preprocess?

ankurshukla03 commented 6 years ago

preprocess_2 is for lm2 dataset and preprocess is for lm3 dataset. Both are for prerpocessing but for different dataset.

What is preprocess_2 and does it replace preprocess?

tristaaan commented 6 years ago

aha, they should be renamed to represent such. e.g. : preprocess_lm3 preprocess_lm2

If the methods are similar enough you could make it single file and just check the dimensionality of the input and append each time too but maybe that's pedantic.

-Tristan Wright

On Sun, May 20, 2018 at 4:30 PM, Ankur Shukla notifications@github.com wrote:

preprocess_2 is for lm2 dataset and preprocess is for lm3 dataset. Both are for prerpocessing but for different dataset.

What is preprocess_2 and does it replace preprocess?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/diegocasmo/iis-project/pull/12#issuecomment-390486981, or mute the thread https://github.com/notifications/unsubscribe-auth/AFN5myaO1YFUM83LmjrJqjtKcO1CIyWjks5t0X4jgaJpZM4UE_9c .

ankurshukla03 commented 6 years ago
ankurshukla03 commented 6 years ago

@diegocasmo