UBC-MDS / forest-fire-area-prediction

This project aims to predict the burned area of forest fires in the northeast region of Portugal, using meteorological and soil moisture data.
https://ubc-mds.github.io/forest-fire-area-prediction/reports/forest_fire_analysis_report.html
MIT License

Milestone 1 Review #48

Open Ivyqiuhan opened 2 years ago

Ivyqiuhan commented 2 years ago

Nice job! Here are some comments and your grades for the first milestone. Please address these concerns in your third milestone submission.

Also, try to close the issues that you have already finished :)

  1. Draft a Team work contract: Correctness

  2. Project set-up: Mechanics

    • Fixing typos
  3. Project proposal: reasoning

  1. A script that downloads the data: Accuracy

  2. A script that downloads the data: Quality

  3. Exploratory data analysis in a literate code document: QUALITY

  1. Exploratory data analysis in a literate code document: VIZ
  1. Exploratory data analysis in a literate code document: REASONING
  1. Exploratory data analysis in a literate code document: ACCURACY
  2. Expectations: Mechanics

voremargot commented 2 years ago

  1. We will address this.
  2. We have changed this in the proposal to be the correct ratio. Thanks!
  3. In terms of discussing plots in the proposal, paragraph 3 discusses several of the plots and the findings from them. If more needs to be added, please clarify what we are missing.
  4. As we are working with a regression problem and not a classification problem, we do not need to be concerned with class imbalance. We do talk about how we address the skewness of the data, which is more relevant to our research question.
  5. We will add a statement to the proposal, but this was thoroughly addressed in the EDA document.
  6. While we thought about building a classification model to determine whether there were forest fires or not, we have opted for a regression model, as stated in the proposal.
  7. We will list the packages we are using in the proposal document. We will also mention our choice not to use feature selection in our case.
  8. While the original proposal did mention a confusion matrix, this was an error, as we are not doing a classification problem. We did mention that we are using RMSE as our scoring metric, so we will be sure to make this clearer.
  9. In the EDA we observed that some months didn't have observations, so we wanted to create features to address this. I have expert knowledge in forest fires and know that seasons are a good feature to add, so we were confident in our feature engineering.
  10. We plan to add this plot, as it will show the reader at a glance how the model performed. We also plan to report the error metrics, but feel that the results will be more impactful if we show them in a plot.
  11. The section titles come from a workflow described in "Art of Data Science", which is used in this course. We wanted to follow these guidelines, which is why we chose these titles. This is cited in the EDA document. (https://leanpub.com/artofdatascience)
  12. All figures have captions. Please clarify where the confusion is.
  13. We feel that the EDA document contains conclusions for many of our plots. Under each plot is an explanation of what we see as well as how we will address the findings. Please specify how we need to make more detailed conclusions.
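
For readers following points 4, 8 and 9 above, here is a minimal Python sketch of the three ideas mentioned: a season feature derived from the month, a log transform for a skewed target, and RMSE as the scoring metric. It is not the team's actual code. The month-to-season mapping, the choice of `log1p`, and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical month-to-season mapping (Northern Hemisphere, three-letter
# month abbreviations). The team's real feature engineering is not shown
# in this thread, so treat this mapping as an assumption.
SEASONS = {
    "dec": "winter", "jan": "winter", "feb": "winter",
    "mar": "spring", "apr": "spring", "may": "spring",
    "jun": "summer", "jul": "summer", "aug": "summer",
    "sep": "autumn", "oct": "autumn", "nov": "autumn",
}


def month_to_season(month):
    """Map a month name or abbreviation (any case) to a season label."""
    return SEASONS[month.strip().lower()[:3]]


def log_transform(areas):
    """Apply log(1 + x) to each value; one common way to reduce right skew.

    Assumption: the thread does not say which transform the team used.
    """
    return [math.log1p(a) for a in areas]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))
```

A season column groups sparse months together, so months with few or no observations still contribute to a populated category. RMSE is reported in the same units as the target it is computed on, so if the target is log-transformed, the score is on the log scale unless predictions are transformed back first.
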
Ivyqiuhan commented 2 years ago

Good work, here's my comment: