Dr-Eberle-Zentrum / Data-projects-with-R-and-GitHub

Feedback for DrMohamedElsherif by Joschka #256

Closed Joschka8878 closed 3 days ago

Joschka8878 commented 3 months ago

Hi @DrMohamedElsherif,

First of all, love the project idea!

As for my suggestions:

Let me know if you have any questions!

Sincerely, Joschka

DrMohamedElsherif commented 3 months ago

Rebuttal

First of all, thank you for your encouraging words and insightful suggestions for my project.

Addressing your concerns step by step:

Response to Point 1: The dataset is not available on the website or anywhere else online. It was part of a data science project management project from a past course and has no codebook. The dataset consists of 4 Excel sheets that together contain all the information needed to analyze and correlate the data:

a) A Bike count sheet: the number of bikes counted each hour of each day from January 2018 until January 2024, across the different channels (stations along each of the three paths where counter devices are installed), indicated by channel_id, for each of the three counter sites, indicated by counter_id.

b) A Counter Site sheet: the names and ids of the three counter sites.

c) A Weather sheet: weather information including temperature, wind, humidity, rain, etc. for each day of the years 2011 and 2012.

d) A Holidays sheet: the federal and state holidays for years 2018 and 2024. This sheet is important for comparing bike traffic across sites on regular workdays versus holidays.

Since there is no codebook, I will include the head of each table in my markdown.

Response to Points 1-2: By the end of the project, there should be a clear picture of:

1- The traffic load at each counter site (possible visualization using bar charts or scatter plots).

2- The temporal variation of bike traffic across the three sites (possible visualization using bar charts or scatter plots).

3- The correlation between bike traffic and a) hours of the day; b) days of the week; c) seasons of the year; and d) location of the counter site (possible visualization using heat maps).

To achieve those goals, the tables have to be joined, for example to link the weather data for each day to the bike count data for that day. To filter holidays, the bike count data needs to be joined with the holidays data first, and so on. Another issue is missing data: it is left to the researcher to decide which method is best for dealing with it, which is a skill to learn by doing this project.
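To make the joining step more concrete, here is a minimal sketch of how the four sheets could be combined. It is only an illustration under assumptions: apart from counter_id and channel_id, all column names (timestamp, count, temperature, rain), the site names, and the sample rows are hypothetical and not taken from the real data. It is written in plain Python to stay self-contained, and the same logic maps onto left joins in any data frame library.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows standing in for the four sheets.
# Only counter_id and channel_id are named in the project description;
# every other column name and value here is an assumption.
bike_counts = [
    {"timestamp": "2018-01-01 08:00", "counter_id": 1, "channel_id": 10, "count": 12},
    {"timestamp": "2018-01-01 09:00", "counter_id": 1, "channel_id": 11, "count": 8},
    {"timestamp": "2018-01-02 08:00", "counter_id": 1, "channel_id": 10, "count": 40},
    {"timestamp": "2018-01-02 08:00", "counter_id": 2, "channel_id": 20, "count": 25},
]
counter_sites = {1: "Site A", 2: "Site B"}
# Weather for 2018-01-02 is deliberately missing to show how gaps surface.
weather = {"2018-01-01": {"temperature": 2.5, "rain": 0.0}}
holidays = {"2018-01-01"}


def daily_totals(rows):
    """Sum hourly counts over all channels per (day, counter_id)."""
    totals = defaultdict(int)
    for row in rows:
        day = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M").date().isoformat()
        totals[(day, row["counter_id"])] += row["count"]
    return totals


def join_tables(rows, sites, weather_by_day, holiday_days):
    """Left-join daily totals with site names, weather, and a holiday flag."""
    joined = []
    for (day, counter_id), total in sorted(daily_totals(rows).items()):
        day_weather = weather_by_day.get(day, {})  # empty dict if no weather row
        joined.append({
            "date": day,
            "site": sites.get(counter_id),
            "total_count": total,
            "temperature": day_weather.get("temperature"),  # None marks missing data
            "rain": day_weather.get("rain"),
            "is_holiday": day in holiday_days,
        })
    return joined


result = join_tables(bike_counts, counter_sites, weather, holidays)
for record in result:
    print(record)
```

Keeping the bike counts as the left side of every join means that days without a weather record are kept and show up as missing values instead of silently disappearing, so the decision on how to handle missing data stays explicit.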

I hope the project idea and goals are now clearer. If you have any further concerns, please do not hesitate to share them.