Open AguilarRaul opened 1 year ago
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Overall, it is a very well-done and interesting project!
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Things that went well:
Suggestions for improvements:
A folder named `doc` could be created in the root folder to store the documentation, such as proposal.md and the final report. The images folder, which includes all the results from the scripts, could be moved to a folder called `results`, and the Python notebooks could be moved to the `src` folder, which would make it easier for the reader to understand and navigate the project. Overall, it is a well-explained and very interesting project. Great work!
Overall, well done! The code is divided into sections within long scripts, which makes it easy to follow, and I have a few minor suggestions:
`doc` folder, maybe you can rename the report folder. This is an interesting topic. Good job!
The `Dataset` section in the README is good and in-depth, but it should not be the first section in the proposal, and it can be shortened, as the README is supposed to give only a broad overview of the project.
Five pieces of feedback that have been implemented:
[Peer review feedback] In the EDA script, more comments and documentation could be added so that it is easy for the reader to understand the script.
Commit URLs: https://github.com/UBC-MDS/horror_movies/commit/b95931e7f50162474832746202418d392fbee0b5 and https://github.com/UBC-MDS/horror_movies/commit/95a0645f5373fde8ef731929bb4ed8ca21bae7b4
File Changed: src/eda_horror.R

[TA feedback] Figure captions missing.
Commit URL: https://github.com/UBC-MDS/horror_movies/commit/bfd30bba27fbe09b092ee469cd1ac11537376606
File Changed: src/inference_horror.R

[TA feedback] The proposal.md file has been created and moved to the doc directory -2 mechanics
Commit URL: https://github.com/UBC-MDS/horror_movies/commit/b21a94378f89f94f32fc0732053b754036361c6b
File Changed: doc/proposal.md

[TA feedback] Plots suffer from one or more severe problems, for instance overplotting, a missing legend, small text, or no axis labels.
Commit URL: https://github.com/UBC-MDS/horror_movies/commit/b99dbe0c7fd5642c1db521d8155fc955bfc2f64a
Files Changed: notebooks/Horror_movies_attributes_and_revenue_EDA.ipynb, src/eda_horror.R

[Peer review] The final report was named EDA_keys.ipynb, which made it hard for me to understand which file was the final report. The naming of the report file could be improved (e.g., {name of the project}_report.ipynb).
Commit URL: https://github.com/UBC-MDS/horror_movies/commit/094d7534728707b4d908465dc50ff674dc55596f
File Changed: report.ipynb
Submitting authors: J99thoms, Lorraine97, AguilarRaul, Hongjian-Sam-Li
Repository: https://github.com/UBC-MDS/horror_movies
Report link: https://github.com/UBC-MDS/horror_movies/blob/main/notebooks/EDA_keys.ipynb
Abstract/executive summary:
Our inferential research question is whether 'high' rated horror movies have a larger median revenue than 'low' rated horror movies (among those with non-zero revenue).

Considering only horror movies with non-zero revenue, let $R_h$ be the population median revenue (in USD) of horror movies with average ratings greater than the median average rating of horror movies, let $R_l$ be the population median revenue (in USD) of horror movies with average ratings no greater than the median average rating of horror movies, and let $\delta = R_h - R_l$ be the difference in population median revenues. Then our hypotheses are:
$\text{H}_0:\ \delta = 0$ and $\text{H}_a:\ \delta > 0.$
Our significance level will be the standard $\alpha = 0.05$.
Our test statistic will be the difference in sample median revenues, $\delta^* = \hat{R}_h - \hat{R}_l$.
Since we are doing inference about the median, a CLT-based approach is not applicable here. Thus, we will use a simulation-based approach for this hypothesis test; in particular, a permutation test. This assumes that our sample is representative of our population of interest.
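As a rough illustration of the permutation test described above, here is a minimal Python sketch using only the standard library. It is not the project's own code (the repository's inference appears to live in src/inference_horror.R), and the function name `permutation_test_median_diff`, the number of permutations, the seed, and the revenue figures are all assumptions made up for this example. It pools the two groups, repeatedly reshuffles the group labels, and reports the share of shuffled differences in medians that are at least as large as the observed $\delta^*$, matching the one-sided alternative $\delta > 0$.

```python
import random
import statistics


def permutation_test_median_diff(high, low, n_perm=5000, seed=0):
    """One-sided permutation test for H_a: median(high) - median(low) > 0.

    Hypothetical helper for illustration; returns (observed difference, p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.median(high) - statistics.median(low)

    pooled = list(high) + list(low)
    n_high = len(high)
    at_least_as_large = 0
    for _ in range(n_perm):
        # Under H_0 the group labels are exchangeable, so shuffle and re-split.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[:n_high]) - statistics.median(pooled[n_high:])
        if diff >= observed:
            at_least_as_large += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Made-up revenues (USD) for illustration only; not data from the project.
high_rated = [1_200_000, 3_500_000, 800_000, 9_000_000, 2_100_000, 4_400_000]
low_rated = [300_000, 950_000, 150_000, 1_100_000, 600_000, 2_000_000]

observed, p_value = permutation_test_median_diff(high_rated, low_rated)
print(f"observed difference in medians: {observed:,.0f}")
print(f"one-sided p-value: {p_value:.4f}")
```

Under this sketch, $\text{H}_0$ would be rejected when the returned p-value falls below the chosen $\alpha = 0.05$. In practice the two input lists would come from splitting the non-zero-revenue movies at the median average rating, as in the definitions of $R_h$ and $R_l$.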
Editor: @flor14
Reviewer: Roan Rain, Ritisha Sharma, Gaoxiang Wang