cole-brokamp / fr

Implement Frictionless Standards in R
https://cole-brokamp.github.io/fr/

getting started vignette #5

Closed cole-brokamp closed 1 year ago

cole-brokamp commented 1 year ago

Frictionless Tabular-Data-Resource

Convert a data frame into a Frictionless tabular-data-resource (i.e., an fr_tdr object) with as_fr_tdr(). Here, we create some metadata based on ?mtcars:

library(fr)

d_fr <-
  mtcars |>
  tibble::as_tibble() |>
  as_fr_tdr(name = "mtcars",
            version = "0.9.1",
            title = "Motor Trend Car Road Tests",
            homepage = "https://rdrr.io/r/datasets/mtcars.html",
            description = "The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).")

Print the fr_tdr object to view all of the table-specific metadata descriptors and the underlying data frame:

d_fr

Use str() or pillar::glimpse() to view all field- and table-specific descriptors:

pillar::glimpse(d_fr)
str(d_fr, max.level = 1)

Use fr_desc() to get a list of descriptors, excluding the list of fields:

fr_desc(d_fr)

Add a metadata descriptor to one of the fields in the tabular data resource using the @ operator or S7::prop() from the {S7} package:

d_fr@fields$cyl@title <- "Number of cylinders"
S7::prop(d_fr@fields$gear, "title") <- "Number of forward gears"

d_fr@fields[c("cyl", "gear")]

## d_fr[c("cyl", "gear")]
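To show that the two assignment forms above are interchangeable, here is a minimal sketch using only {S7}. The toy_field class is a hypothetical stand-in written for this illustration; it is not how fr defines its own field class.

```r
library(S7)

# Hypothetical stand-in for a field class; NOT the fr package's definition.
toy_field <- new_class(
  "toy_field",
  properties = list(
    name = class_character,
    title = class_character
  )
)

cyl <- toy_field(name = "cyl")

# `@<-` and `prop<-` are two spellings of the same property assignment.
cyl@title <- "Number of cylinders"
prop(cyl, "title") <- "Number of cylinders (via prop)"

# Reading works the same way with either accessor.
cyl@title
prop(cyl, "name")
```

S7::prop() is mainly useful when the property name is held in a variable, since @ needs the name written out literally.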

Add a name metadata descriptor for each field in the fr_tdr object:

# TODO

the_tdr <- c(fr_desc(d_fr), list(fields = lapply(d_fr@fields, fr_desc)))

str(the_tdr)
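The c(..., list(fields = ...)) idiom above appends one nested element to a flat list of table-level descriptors. A rough sketch of the resulting shape, built by hand in base R, may help. Every value below is illustrative and typed in manually (including the "number" type labels); none of it is output from fr_desc().

```r
# Hand-written illustration; NOT produced by fr.
table_desc <- list(
  name = "mtcars",
  version = "0.9.1",
  title = "Motor Trend Car Road Tests"
)

# One small descriptor list per field, keyed by field name.
field_desc <- list(
  cyl = list(name = "cyl", type = "number", title = "Number of cylinders"),
  gear = list(name = "gear", type = "number", title = "Number of forward gears")
)

# Wrapping field_desc in list(fields = ...) keeps it as a single nested
# element instead of splicing each field into the top level.
toy_tdr <- c(table_desc, list(fields = field_desc))

str(toy_tdr)
```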

str() gives a useful overview of the structure of an fr_tdr object, including all field-specific metadata, table-specific metadata, and the underlying @value data vector:

str(d_fr)

Use fr_schema() to extract the metadata for each field as a list. Pair this with {listviewer} to browse the list interactively:

fr_schema(d_fr) |>
    listviewer::jsonedit(mode = "view")

Accessor functions work as they do with data frames and tibbles, but return an fr_field or fr_tdr object:

d_fr[["disp"]] |> class()

d_fr$disp |> class()

d_fr["disp"] |> class()
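For comparison, here is a minimal sketch of the same three accessors on a plain data frame, using base R only and no fr functions. If fr mirrors this pattern, [[ and $ would correspond to the single-field case and [ to the table case. That mapping is an inference from the text above, not verified output.

```r
# Base R only: the same three accessors applied to a plain data frame.
class(mtcars[["disp"]])  # a bare column vector
class(mtcars$disp)       # same column, same class
class(mtcars["disp"])    # still a data frame, with one column

# `[[` and `$` extract one column; `[` keeps the table structure.
identical(mtcars[["disp"]], mtcars$disp)
ncol(mtcars["disp"])
```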

fr_tdr objects can be used mostly anywhere that the underlying data frame can be used:

lm(mpg ~ cyl + disp + wt, data = d_fr)

Explicitly drop the Frictionless attributes and extract just the underlying data frame with as_data_frame() or as_tibble():

tibble::as_tibble(d_fr)

summary(d_fr)

summary(as_data_frame(d_fr))

An fr_tdr object is essentially a list of fr_fields with table-specific metadata descriptors.

Create a list of fr_fields using fr_field() and use it to create an fr_tdr object:

# TODO add example for list approach