scott-yj-yang opened 1 year ago
Our project only tries to predict the price range of a mobile phone, without considering its actual price, so when we train on the dataset the target is an integer from 0 to 3 indicating the price range, from low to very high. We understand your point that if the actual range is 0, a prediction of 1 is less wrong than a prediction of 2 or 3. But changing our evaluation metric to mean squared error sounds strange, since our outputs are categories (0, 1, 2, and 3). Do you have any suggestions on how we should formulate this?
@eleeeysh
Hi, sorry for the late reply.
It's fine to define the problem as a classification problem and evaluate your models like any ordinary classification model. But since ordinal regression sits somewhere between classification and numerical regression, I would suggest that in addition to classification metrics (e.g. precision, recall), you also measure your model with metrics for a numerical target (e.g. MAE) and explore: (1) What if, during training, we penalize a misclassification more when the prediction is further from the ground truth (i.e. if a sample's true label is 1, the loss is higher when it's predicted as 5 than when it's predicted as 3)? Will MAE decrease? How will precision/recall be affected? (2) What if, instead of directly fitting a classification model, we first fit a linear regression model and then convert its output to a category (e.g. predicted price = $2000 ==> category = 3)? How would precision/recall/MAE change?
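Suggestion (2) can be sketched in a few lines; everything here (the tiny dataset, the single feature, the rounding into four bins) is made up purely for illustration:

```python
import numpy as np

# Hypothetical toy dataset: one feature (say, a hardware score),
# target = ordinal price range 0-3. Values are invented for illustration.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# Fit a plain linear regression (least squares with an intercept column) ...
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_cont = A @ coef                     # continuous predictions

# ... then convert the continuous output back into one of the 4 categories.
y_pred = np.clip(np.rint(y_cont), 0, 3).astype(int)

# Evaluate with both a classification-style metric and a numerical one.
accuracy = np.mean(y_pred == y)
mae = np.mean(np.abs(y_pred - y))
```

The same comparison (accuracy vs. MAE) can then be run against a plain classifier to see whether exploiting the ordering actually helps.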
These are some directions I can think of, given that this is an ordinal regression problem; you don't have to follow them, and you can definitely try anything you feel makes sense. It's also fine if, in the end, you find that treating these categories like any ordinary categories and applying only a traditional classification model yields the best result. My point is just that you need to show you realize this is not a common classification problem, identify the potential issues, and explain how you are going to address them.
evaluation metric: +0.75
Project Proposal Feedback
Score (out of 9)
Score = 9
Feedback:
Rubric
Scoring: Out of 9 points
If students address the detailed feedback in a future checkpoint, they will earn these points back.
Comments
I understand you are using these evaluation metrics because you formulate the problem as a classification problem. But treating price ranges as unrelated classes seems a bit odd to me. Perhaps I missed it, but you need to take the ordering of the price ranges into consideration both during training and evaluation. For example, if the actual price of a phone is $650, predicting the 700-1500 range is not too wrong, but predicting 2000-2800 should definitely be considered 'very wrong'. Therefore you need to rethink whether evaluation metrics like TP/FP are suitable here.
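The issue above can be made concrete with a toy example (the labels below are invented): plain accuracy cannot distinguish a near-miss from a wild miss, while a distance-aware metric such as MAE can.

```python
import numpy as np

# All phones are truly in the lowest price range (category 0).
y_true = np.array([0, 0, 0, 0])
pred_near = np.array([1, 1, 1, 1])   # every prediction off by one range
pred_far = np.array([3, 3, 3, 3])    # every prediction off by three ranges

# Accuracy treats both predictors as equally (and completely) wrong.
acc_near = np.mean(pred_near == y_true)   # 0.0
acc_far = np.mean(pred_far == y_true)     # 0.0

# MAE penalizes predictions by how far they land from the true range.
mae_near = np.mean(np.abs(pred_near - y_true))   # 1.0
mae_far = np.mean(np.abs(pred_far - y_true))     # 3.0
```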