YajunHuang2 / Tut

Tut-peer-review #1

Open YajunHuang2 opened 5 months ago

siru1366 commented 5 months ago

peer-review

author: "Yajun Huang" format: html

This is a simple code review.
First, we need to check the overall design of the code.

#### Preamble ####
# Purpose: Read in data from the 2022 Australian Election and make
# a graph of the number of seats each party won.
# Author: Yajun Huang
# Email: yajun.huang@utoronto.ca
# Date: 14 January 2024
# Prerequisites: Know where to get Australian elections data.

An R script should open with comments, and this preamble is a good example. It states the purpose of the document and gives the author's name with a contact email. The date is also important.

#### Workspace setup ####
install.packages("tidyverse")
install.packages("janitor")

Next, the workspace needs to be set up, which means installing and loading the required packages. A package only needs to be installed once per computer, but it must be loaded in every session that uses it. This script uses the tidyverse and janitor packages. Because this is their first use, they are installed first and then loaded as needed.
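As a rough sketch of the install-once, load-every-session pattern described above, the workspace setup could look something like this. The helper name `ensure_package()` is hypothetical and is not part of the reviewed script; it only illustrates one way to avoid reinstalling packages on every run.

```r
# Hypothetical helper (not in the reviewed script):
# install a package only if it is missing, then load it for this session.
ensure_package <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}
```

In the script this might be called as `ensure_package("tidyverse")` and `ensure_package("janitor")`. The acquire step also calls `list_package_resources()` and `get_resource()`, which come from the opendatatoronto package, so that package would presumably need the same treatment.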

#### Acquire ####
toronto_shelters <- 
  # Each package is associated with a unique id found in the "For 
  # Developers" tab of the relevant page from Open Data Toronto
  # [Open Data Toronto - Shelter Occupancy](https://open.toronto.ca/dataset/daily-shelter-overnight-service-occupancy-capacity/)
  list_package_resources("21c83b32-d5a8-4106-a54f-010dbe49f6f2") |>
  # Within that package, we are interested in the 2021 dataset
  filter(name == 
    "daily-shelter-overnight-service-occupancy-capacity-2021.csv") |>
  # Having reduced the dataset to one row we can get the resource
  get_resource()

write_csv(
  x = toronto_shelters,
  file = "toronto_shelters.csv"
)

head(toronto_shelters)

This makes it very clear to readers where the data comes from. The code is well-structured and appears to accomplish its task of acquiring the 2021 daily shelter overnight service occupancy and capacity data from Open Data Toronto. The comments are written in plain English, which helps readers follow the more complex code.
It's good that you included a URL for the Open Data Toronto dataset. However, you might want to give it its own plain comment line to make it more explicit.

toronto_shelters_clean <-
  clean_names(toronto_shelters) |>
  mutate(occupancy_date = ymd(occupancy_date)) |> 
  select(occupancy_date, occupied_beds)

head(toronto_shelters_clean)

This is the most important part of the code: it takes the acquired data and forms a clean data set.
The clean_names() function is used to clean the column names, which is good practice for consistency. Using ymd() from the lubridate package to convert "occupancy_date" to a date format is appropriate. However, consider adding comments that explain the purpose of each step and any specific considerations. This would help a reviewer who is not familiar with R.
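To illustrate the suggestion, here is a sketch of the same cleaning step with a comment on each stage. The small `toronto_shelters` data frame below is a made-up stand-in so the example runs on its own; its values and the extra `LOCATION_NAME` column are assumptions for illustration, not the real Open Data Toronto data.

```r
library(dplyr)     # mutate(), select()
library(janitor)   # clean_names()
library(lubridate) # ymd()

# Made-up stand-in for the downloaded data (illustrative values only)
toronto_shelters <- data.frame(
  OCCUPANCY_DATE = c("2021-01-01", "2021-01-02"),
  OCCUPIED_BEDS = c(10, 12),
  LOCATION_NAME = c("Shelter A", "Shelter B")
)

toronto_shelters_clean <-
  # Standardize column names to snake_case, e.g. OCCUPANCY_DATE -> occupancy_date
  clean_names(toronto_shelters) |>
  # Parse the date strings (year-month-day) into proper Date values
  mutate(occupancy_date = ymd(occupancy_date)) |>
  # Keep only the two columns needed for the analysis
  select(occupancy_date, occupied_beds)

head(toronto_shelters_clean)
```

With comments like these, a reader can see why each step is there without having to know what clean_names() or ymd() does.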