@makka3 and @Jamienie, you have chosen an interesting question, and I look forward to reading more of your analysis. I always feel uneasy when locking my bike somewhere in Vancouver, so I am excited to see your results! Here are some points for improvement and minor suggestions for your project.
Reasoning
I think the explanatory variable in your analysis is a bit unclear. Is it the month as a categorical variable with 12 levels, the season as a categorical variable with 4 levels, or a binary variable such as "summer" versus "not summer"? You should make clear exactly what you are comparing.
You are comparing the difference in means between groups. Such an analysis rests on several assumptions, so do not forget to state and check them. If the assumptions do not hold, consider alternative methods that do not rely on them. (This comment applies mainly to your future work on the analysis rather than to the proposal.)
You should give more thought to the statements in your analysis plan. What do you mean by computing a test statistic that corresponds to the null hypothesis? How will you generate the simulated data? What will you do with the resulting p-value? What significance level will you use?
Keep in mind that your readme is the landing page for anyone examining your analysis. A few additions would make it more informative and engaging for a visitor. In particular, you can:
include a brief introduction to the context and your motivation for doing this analysis.
explain the data in a bit more detail, state the variables included, or print the head of the data table in the main readme. Including the link to the data source is a good idea, but the data still needs to be explained in your repository.
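To make the questions about the analysis plan more concrete, here is a minimal sketch of one way such a plan could be spelled out: a permutation test for a difference in means between two groups. This is only an illustration and not necessarily the method you intend. The function name `permutation_test`, the "summer" versus "not summer" grouping, the significance level of 0.05, and the monthly counts are all assumptions I made up for the example; they are not taken from your data.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=522):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration; returns (observed_difference, p_value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    # Test statistic: the observed difference in means.
    observed = mean(group_a) - mean(group_b)

    # Under the null hypothesis the group labels are exchangeable,
    # so simulated data come from shuffling the pooled observations.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        simulated = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(simulated) >= abs(observed):
            extreme += 1

    # p-value: share of simulated statistics at least as extreme as observed.
    return observed, extreme / n_permutations


# Made-up monthly theft counts, for illustration only (NOT the real data).
summer = [310, 295, 330, 342]
not_summer = [180, 150, 160, 210, 240, 200, 170, 155]

alpha = 0.05  # assumed significance level, fixed before looking at the data
observed, p_value = permutation_test(summer, not_summer)
reject_null = p_value < alpha
print(f"observed difference = {observed:.2f}, p-value = {p_value:.4f}, "
      f"reject H0: {reject_null}")
```

Writing the plan at this level of detail answers each question in turn: what the test statistic is, how the simulated data are generated under the null hypothesis, how the p-value is computed, and what decision follows from comparing it with the significance level.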
Mechanics
I see you have communicated through issues, which is great! One suggestion is to change how you title the issues. For example, rather than "V0.2 release is created", write "Create V0.2 release" and close the issue once the release is created, i.e. once the issue is resolved. I see that some resolved issues are still open in your repo.
Your proposal contains the statement "Imported data can be found here, in the src folder of our repository, and a snippet of it can be found here." The imported data should be in the data folder, not the src folder. Also, the folder in your repository is actually named script, so pick either script or src and use it consistently. The link in this sentence leads to the script that imports the data, not to the imported data itself. Therefore, you might want to change both the wording of the sentence and the link.
Minor Suggestions:
I would recommend taking "DSCI_522" out of the name of your project. But if you like it this way, no problem.
Include your GitHub profile links in the Team Members section of your readme.
You might want to follow the repo structure as outlined here by Tiffany. Every repo may differ a bit, but I believe it is better to follow a standard structure, as you will see in most data science projects.
I hope this feedback is helpful in improving your project. Please let me know if you have any questions. Good luck!