gmihaila / ml_things

This is where I put things I find useful that speed up my work with Machine Learning. Ever looked through your old projects to reuse those cool functions you created before? Well, this repo is designed as a Python library of reusable functions I created in my previous projects. I also share some notebook tutorials and Python code snippets.
https://gmihaila.github.io
Apache License 2.0

predictions - metrics #6

Closed: shainaraza closed this issue 2 years ago

shainaraza commented 3 years ago

Hi @gmihaila, any suggestions on saving the model's predictions? Also, since you are using sklearn.metrics, I think we could use more metrics from sklearn.metrics.

thanks

gmihaila commented 3 years ago

@shainaraza once you have the model's predictions you can save them in whatever format you need: a text file, CSV, pickle, etc.

Yes, you can use any metrics that make sense for your experiments.

I personally only compute the metric and save it instead of all the predictions.
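For illustration, a minimal sketch of both options; the `predictions` and `labels` arrays here are hypothetical placeholders for your model's outputs:

```python
import pickle

import pandas as pd
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical placeholders for your model's outputs and true labels.
predictions = [0, 1, 1, 0]
labels = [0, 1, 0, 0]

# Option 1: save all predictions, e.g. as CSV or pickle.
pd.DataFrame({"label": labels, "prediction": predictions}).to_csv(
    "predictions.csv", index=False)
with open("predictions.pkl", "wb") as f:
    pickle.dump(predictions, f)

# Option 2: compute and save only the metrics.
metrics = {
    "accuracy": accuracy_score(labels, predictions),
    "f1_macro": f1_score(labels, predictions, average="macro"),
}
print(metrics)
```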

shainaraza commented 3 years ago

thanks @gmihaila, another quick question: I don't know why, but my model's accuracy increases when I make it a binary classification; with multiclass, it drops badly.

All the changes are adjusted for multiclass, for example the change to label numbers. Does it have something to do with the scoring function?

gmihaila commented 3 years ago

@shainaraza I'm not sure what you are doing classification on, but if you drop one of the labels it's OK if the accuracy jumps. That means you dropped a label that was hard for the model to classify, and you also tend to get higher accuracy with fewer labels, since there are fewer ways for the model to make mistakes.
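As a toy illustration of why accuracy jumps with fewer labels (the numbers below are made up):

```python
from sklearn.metrics import accuracy_score

# Three-class case: the model struggles with the hypothetical label 2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 0, 1]
print(accuracy_score(y_true, y_pred))  # 0.67

# Binary case: same behavior on the remaining labels, accuracy jumps.
y_true_bin = [0, 0, 1, 1]
y_pred_bin = [0, 0, 1, 1]
print(accuracy_score(y_true_bin, y_pred_bin))  # 1.0
```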

shainaraza commented 3 years ago

It is solved now. The reason was that the labels were not balanced (an imbalanced dataset), so I did undersampling.
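For reference, a minimal sketch of one way to undersample with sklearn; the `df` DataFrame and its `label` column are hypothetical stand-ins for the actual data:

```python
import pandas as pd
from sklearn.utils import resample

# Hypothetical imbalanced dataset: four examples of class 0, two of class 1.
df = pd.DataFrame({
    "text": ["a", "b", "c", "d", "e", "f"],
    "label": [0, 0, 0, 0, 1, 1],
})

# Downsample every class to the size of the smallest class.
minority_size = df["label"].value_counts().min()
balanced = pd.concat([
    resample(group, replace=False, n_samples=minority_size, random_state=42)
    for _, group in df.groupby("label")
])
print(balanced["label"].value_counts())  # both classes now have 2 examples
```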

thanks once again

gklabs commented 2 years ago

@gmihaila thanks for the great notebook! On the same theme of saving predictions, does trainer.evaluate() return predictions for the entire validation set? Is there a way to save them?

gmihaila commented 2 years ago

@gklabs Thank you for your interest in my notebook! 😄 trainer.evaluate() only returns the evaluation metrics on the validation set. If you want the predictions themselves you can use trainer.predict. Check out the Hugging Face documentation for it. Let me know if this helps!
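For example, a minimal sketch of saving the validation-set predictions; `trainer` and `valid_dataset` are placeholders for your own objects:

```python
import numpy as np
import pandas as pd

# trainer.predict returns a PredictionOutput with .predictions (logits),
# .label_ids, and .metrics.
output = trainer.predict(valid_dataset)

# Convert logits to class predictions and save them alongside the labels.
preds = np.argmax(output.predictions, axis=-1)
pd.DataFrame({"label": output.label_ids, "prediction": preds}).to_csv(
    "valid_predictions.csv", index=False)

print(output.metrics)  # the same evaluation metrics, also included here
```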