frictionlessdata / frictionless-r

R package to read and write Frictionless Data Packages
https://docs.ropensci.org/frictionless/

Support new field type `list` #179

Open peterdesmet opened 3 months ago

peterdesmet commented 3 months ago

CHANGELOG: https://datapackage.org/overview/changelog/#list-field-type-new

khusmann commented 1 month ago

I'm interested in this field type for representing multiselect items, although that will have to wait until the list field type can be extended with the categories property.

In the meantime, I'd vote for the latter approach (convert the cell to a vector using delimiter and load as list-columns).

For example, the CSV:

row_id,field1
1,"a,b,c"
2,"d,e"
3,"f"

with schema fields:

[
  {
    "name": "row_id",
    "type": "integer"
  },
  {
    "name": "field1",
    "type": "list",
    "delimiter": ",",
    "itemType": "string"
  }
]

would become:

library(tidyverse)
tibble(
  row_id = 1:3,
  field1 = list(
    c("a", "b", "c"),
    c("d", "e"),
    c("f")
  )
)
#> # A tibble: 3 × 2
#>   row_id field1   
#>    <int> <list>   
#> 1      1 <chr [3]>
#> 2      2 <chr [2]>
#> 3      3 <chr [1]>

Created on 2024-06-07 with reprex v2.0.2
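
For what it's worth, the conversion step itself could be sketched in base R roughly as below. This is only an illustration under assumptions: `parse_list_field()` is a hypothetical helper name, not an existing frictionless function, and it only covers `itemType` "string" (other item types would need a further conversion of each vector).

```r
# Hypothetical helper (not part of frictionless): split each cell of a
# character column on the schema's delimiter, giving a list-column.
parse_list_field <- function(x, delimiter = ",") {
  # fixed = TRUE treats the delimiter literally rather than as a regex
  strsplit(x, delimiter, fixed = TRUE)
}

# The column as it would look after reading the CSV as plain strings
raw <- data.frame(
  row_id = 1:3,
  field1 = c("a,b,c", "d,e", "f"),
  stringsAsFactors = FALSE
)

# Replace the string column with a list of character vectors
raw$field1 <- parse_list_field(raw$field1, delimiter = ",")
str(raw$field1)
```

Using `fixed = TRUE` matters for delimiters such as `|` or `.`, which would otherwise be interpreted as regular expression syntax.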