scott-yj-yang commented 1 year ago

Project Proposal Feedback

Score (out of 9)

Score = 9

Feedback:

	Quality	Reasons
Abstract
Background
Problem Statement
Data
Proposed Solution
Evaluation Metrics
Ethics & Privacy
Team expectations
Project Timeline Proposal

Rubric

	Unsatisfactory	Developing	Proficient	Excellent
Abstract	Abstract is confusing or fails to offer important details about the issue, variables, context, or methods of the project.	Abstract lacks relevance or fails to offer pertinent details about the issue, variables, context, or methods of the project.	Abstract is relevant, offering details about the research project.	Abstract is informative, succinct, and clear. It offers specific details about the educational issue, variables, context, and proposed methods of the study.
Problem Statement	Research issue remains unclear. The research purpose, questions, hypotheses, definitions or variables, and controls are still largely undefined, or they are poorly formed, ambiguous, or not logically connected to the description of the problem. Unclear whether the research problem is quantifiable, measurable, and replicable.	Research issue is identified, but the statement is too broad or fails to establish the importance of the problem. The research purpose, questions, hypotheses, definitions or variables, and controls are poorly formed, ambiguous, or not logically connected to the description of the problem. Limited description of whether the research problem is quantifiable, measurable, and replicable.	Identifies a relevant research issue. Research questions are succinctly stated, connected to the research issue, and supported by the literature. Variables and controls have been identified and described. Clear reasoning about and description of how the research problem is quantifiable, measurable, and replicable.	Presents a significant research problem. Articulates clear, reasonable research questions given the purpose, design, and methods of the project. All variables and controls have been appropriately defined. Clear and significant reasoning on the quantifiability, measurability, and replicability of the research problem. All elements are mutually supportive.
Background	Did not have at least 2 reliable and relevant sources. Or relevant sources were not used in relevant ways	A key component was not connected to the research literature. Selected literature was from unreliable sources. Literary supports were vague or ambiguous.	Key research components were connected to relevant, reliable theoretical and research literature.	Narrative integrates critical and logical details from the peer-reviewed theoretical and research literature. Each key research component is grounded in the literature. Attention is given to different perspectives, threats to validity, and opinion vs. evidence.
Proposed Solution	Lacks most details; vague or interpretable in different ways. Or seems completely unrealistic and inapplicable to the project domain.	Limited descriptions of the rationales and theories behind the solution provided and on how the solution will be tested. Limited relevance to the input dataset and problem to be solved.	Sufficient details on algorithmic description or theoretical properties; clear definition of how the solution will be tested, reproduced, and on the benchmark used.	Highly clear and succinct description of the rationales and theories behind the solution; thorough and comprehensive consideration of how the solution will be applied and tested; valid approach on how to reproduce the solution and effective benchmark to test the solution; a strong connection to the problem proposed.
Data	Did not have references to relevant data sources for this problem. Did not describe the data obtained at those sources.	A key data source was not referenced or described at a satisfactory level of detail.	All relevant data sources were referenced and described in terms of their key variables and size.	Multiple data sources for each aspect of the project. All data sources are fully described and referenced. The details of the descriptions also make it clear how they support the needs of the project.
Evaluation Metrics	Did not propose any metric for evaluating the model or very little effort in this section.	Evaluation metrics proposed with limited relevance or inappropriate metrics; ambiguous description of the metrics to be used.	Thoughtful and meaningful evaluation metrics with sufficient considerations and descriptions of the model to be evaluated.	Effective and comprehensive evaluation metrics with thorough and detailed descriptions.
Ethics	No effort or just says we have no ethical concerns	Minimal ethical section; probably just talks about data privacy and no unintended consequences discussion. Ethical concerns raised seem irrelevant.	Ethical concerns described are appropriate and described sufficiently	Ethical concerns are described clearly and succinctly. This was clearly a thoughtful and nuanced approach to the issues
Team expectations	Lack of expectations	The list of expectations feels incomplete and perfunctory	It feels like the list of expectations is complete and seems appropriate	The list clearly was the subject of a thoughtful approach and already indicates a well-working team
Timeline	Lack of timeline. Or the timeline is completely unrealistic.	The timeline feels incomplete and perfunctory. The timeline feels either too fast or too slow for the progress you expect a group can make.	It feels like the timeline is complete and appropriate. It can likely be completed as is in the available amount of time.	The timeline was clearly the subject of a thoughtful approach and indicates that the team has a detailed plan that seems appropriate and completable in the allotted time.

Scoring: Out of 9 points

Each Developing => -0.75 pts
Each Unsatisfactory/Missing => -1.5 pts
- until the score is 0
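
As a rough illustration of the deduction rule above, here is a minimal Python sketch. The function name `proposal_score` and its parameters are hypothetical, made up for this example; it assumes deductions simply add up and the result is floored at 0.

```python
def proposal_score(n_developing: int, n_unsatisfactory: int, max_points: float = 9.0) -> float:
    """Apply the rubric deductions (hypothetical helper, not part of any course tooling)."""
    # Each Developing costs 0.75 pts; each Unsatisfactory/Missing costs 1.5 pts.
    deduction = 0.75 * n_developing + 1.5 * n_unsatisfactory
    # Deductions stop once the score reaches 0.
    return max(0.0, max_points - deduction)


print(proposal_score(0, 0))  # no deductions: 9.0
print(proposal_score(2, 1))  # 9 - 2*0.75 - 1.5 = 6.0
```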

If students address the detailed feedback in a future checkpoint, they will earn these points back.

Comments

Everything looks great to me. For model selection, you could also consider random forest regression or SVM, which could provide more insight into the data. Also, since you mentioned that you would split up the data, consider fitting a single model on all of the data after you transform everything into numerical variables. If one-hot encoding produces too many columns, you can also consider dropping the variables that take up the most space once encoded.
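
One possible way to try this is sketched below, assuming pandas and scikit-learn. The data is synthetic and the column names (`size`, `category`, `item_id`), the `MAX_LEVELS` threshold, and the choice of 5-fold R^2 are all assumptions made up for illustration; none of them come from your proposal.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (hypothetical columns, not the project's dataset).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "size": rng.normal(50, 10, n),
    "category": rng.choice(["a", "b", "c"], n),
    "item_id": [f"id_{i}" for i in range(n)],  # one level per row: too wide to one-hot
})
y = 2.0 * df["size"] + (df["category"] == "a") * 5.0 + rng.normal(0, 1, n)

# Drop categorical variables whose one-hot encoding would add too many columns.
MAX_LEVELS = 20  # arbitrary threshold chosen for this sketch
cat_cols = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]
too_wide = [c for c in cat_cols if df[c].nunique() > MAX_LEVELS]

# One-hot encode the remaining categoricals so every column is numeric.
X = pd.get_dummies(df.drop(columns=too_wide), dtype=float)

# Compare a random forest with an SVM regressor on the full encoded data.
models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "svr": make_pipeline(StandardScaler(), SVR()),  # SVR is sensitive to feature scale
}
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

The SVR here uses default hyperparameters, so its score is only a starting point; tuning `C` and `epsilon` would likely be needed before drawing conclusions.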

ashesh8500 commented 1 year ago

Hi, can you help explain how we can implement random forest regressors on a continuous set of variables?

Dongze-Li commented 1 year ago

If you are using scikit-learn, it should be as simple as:

    from sklearn.ensemble import RandomForestRegressor

    rf = RandomForestRegressor(random_state=42)  # fixed seed for reproducibility
    rf.fit(X_train, y_train)
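
For a fuller picture, here is a minimal end-to-end sketch on continuous features. It assumes synthetic data from `make_regression` in place of your dataset, and the 80/20 split and the metrics are illustrative choices rather than requirements. Tree-based models split on thresholds, so continuous inputs need no scaling or binning beforehand.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic continuous features and target (stand-in for the real data).
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=42)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the forest directly on the raw continuous features.
rf = RandomForestRegressor(random_state=42)
rf.fit(X_train, y_train)

# Evaluate on the held-out rows.
y_pred = rf.predict(X_test)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R^2:", r2_score(y_test, y_pred))

# Impurity-based importances give a rough view of which features matter.
print("Feature importances:", rf.feature_importances_)
```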

COGS118A / Group020-SP23