scott-yj-yang opened 1 year ago
Our project only tries to predict the price range of a mobile phone, without considering its actual price, so when we train on the dataset the target is an integer from 0 to 3 indicating the price range, from low to very high. We understand your point that if the actual range is 0, a prediction of 1 is less wrong than a prediction of 2 or 3. But changing our evaluation metric to mean squared error sounds strange, since our outputs are categories (0, 1, 2, and 3). Do you have any suggestions on how we should formulate this?
@eleeeysh
Hi, sorry for the late reply.
It's fine to define the problem as a classification problem and evaluate your models like any ordinary classification model. But since ordinal regression sits somewhere between classification and numerical regression, I would suggest that in addition to classification metrics (e.g. precision, recall), you also measure your model with metrics for a numerical target (e.g. MAE) and explore: (1) What if, during training, we penalize a misclassification more when the prediction is further from the ground truth (i.e. if a sample's true label is 1, the loss is higher when it's predicted as 5 than when it's predicted as 3)? Will MAE decrease? How will precision/recall be affected? (2) What if, instead of directly fitting a classification model, we first fit a linear regression model and then convert its output to a category (e.g. predicted price = $2000 ==> category = 3)? How would precision/recall/MAE change?
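Suggestion (2) can be sketched in a few lines; everything here (the tiny dataset, the single feature, the rounding into four bins) is made up purely for illustration:

```python
import numpy as np

# Hypothetical toy dataset: one feature (say, a hardware score),
# target = ordinal price range 0-3. Values are invented for illustration.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# Fit a plain linear regression (least squares with an intercept column) ...
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_cont = A @ coef                     # continuous predictions

# ... then convert the continuous output back into one of the 4 categories.
y_pred = np.clip(np.rint(y_cont), 0, 3).astype(int)

# Evaluate with both a classification-style metric and a numerical one.
accuracy = np.mean(y_pred == y)
mae = np.mean(np.abs(y_pred - y))
```

The same comparison (accuracy vs. MAE) can then be run against a plain classifier to see whether exploiting the ordering actually helps.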
These are some directions I can think of, given that this is an ordinal regression problem; you don't have to follow them, and you can definitely try anything you feel makes sense. It's also fine if, in the end, you find that treating these categories like any ordinary categories and applying only a traditional classification model yields the best result. My point is just that you need to show you realize this is not a common classification problem, identify the potential issues, and explain how you are going to address them.
evaluation metric: +0.75
Project Proposal Feedback
Score (out of 9)
Score = 9
Feedback:
Rubric
Scoring: Out of 9 points
If students address the detailed feedback in a future checkpoint, they will earn these points back.
Comments
I understand you are using these evaluation metrics because you formulate the problem as a classification problem. But treating price ranges as unrelated classes seems a bit odd to me. Perhaps I missed it, but you need to take the ordering of the price ranges into consideration both during training and evaluation. For example, if the actual price of a phone is $650, predicting the 700-1500 range is not too wrong, but predicting 2000-2800 should definitely be considered 'very wrong'. Therefore you need to rethink whether evaluation metrics like TP/FP are suitable here.
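The issue above can be made concrete with a toy example (the labels below are invented): plain accuracy cannot distinguish a near-miss from a wild miss, while a distance-aware metric such as MAE can.

```python
import numpy as np

# All phones are truly in the lowest price range (category 0).
y_true = np.array([0, 0, 0, 0])
pred_near = np.array([1, 1, 1, 1])   # every prediction off by one range
pred_far = np.array([3, 3, 3, 3])    # every prediction off by three ranges

# Accuracy treats both predictors as equally (and completely) wrong.
acc_near = np.mean(pred_near == y_true)   # 0.0
acc_far = np.mean(pred_far == y_true)     # 0.0

# MAE penalizes predictions by how far they land from the true range.
mae_near = np.mean(np.abs(pred_near - y_true))   # 1.0
mae_far = np.mean(np.abs(pred_far - y_true))     # 3.0
```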