frictionlessdata / frictionless-r

R package to read and write Frictionless Data Packages
https://docs.ropensci.org/frictionless/

Non-required columns and unordered columns in `add_resource()` #254

Closed Rafnuss closed 2 weeks ago

Rafnuss commented 3 weeks ago

I did not expect an error when adding a resource whose data lacks a column that the schema does not mark as required.

The schema at https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json does not require the `valid` column, which is missing from my data.

library(frictionless)
create_package() |>
  add_resource(
    "measurements",
    data.frame(
      "tag_id" = "18LY",
      "sensor" = "pressure", 
      "datetime" = "2020-05-01", 
      "value" = 12
    ),
    schema = jsonlite::read_json("https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json"))
#> Error in `check_schema()`:
#> ! Field names in `schema` must match column names in `data`.
#> ℹ Field names: "tag_id", "sensor", "datetime", "value", and "valid".
#> ℹ Column names: "tag_id", "sensor", "datetime", and "value".

Also, I am not sure why the columns must be provided in the same order as in the schema. Is it not possible to reorder the data according to the schema?

library(frictionless)
create_package() |>
  add_resource(
    "measurements",
    data.frame(
      "tag_id" = "18LY",
      "sensor" = "pressure", 
      "datetime" = "2020-05-01", 
      "valid" = FALSE,
      "value" = 12
    ),
    schema = jsonlite::read_json("https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json"))
#> Error in `check_schema()`:
#> ! Field names in `schema` must match column names in `data`.
#> ℹ Field names: "tag_id", "sensor", "datetime", "value", and "valid".
#> ℹ Column names: "tag_id", "sensor", "datetime", "valid", and "value".

Rafnuss commented 3 weeks ago

Actually, after reading more on this, I realised that this depends on `fieldsMatch`. Maybe a more complex solution is required?

peterdesmet commented 2 weeks ago

Hi @Rafnuss, you (and many others, including me) want optional and reordered fields.

This feature is not supported in Data Package 1.0, which is the version that frictionless currently implements. So right now, you (annoyingly) need to include all columns in your data, even if they are empty. Alternatively, you can preprocess your schema before adding it to your resource.
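
A minimal sketch of those two workarounds in base R. The helpers `schema_field_names()`, `pad_data_to_schema()` and `trim_schema_to_data()` are hypothetical names and are not part of frictionless. The sketch assumes the schema is a list with a `fields` element whose entries each have a `name`, which is the shape `jsonlite::read_json()` returns for a Table Schema. The small schema below is a names-only stand-in built from the field names in the error message above, not the real GeoLocator-DP schema.

```r
# Hypothetical helper: field names in the order the schema declares them.
schema_field_names <- function(schema) {
  vapply(schema$fields, function(field) field$name, character(1))
}

# Workaround 1 (hypothetical helper): make the data match the schema by
# adding missing fields as all-NA columns and reordering to schema order.
pad_data_to_schema <- function(data, schema) {
  fields <- schema_field_names(schema)
  extra <- setdiff(names(data), fields)
  if (length(extra) > 0) {
    stop("Columns not described in schema: ", paste(extra, collapse = ", "))
  }
  for (name in setdiff(fields, names(data))) {
    data[[name]] <- rep(NA, nrow(data))
  }
  data[, fields, drop = FALSE]
}

# Workaround 2 (hypothetical helper): make the schema match the data by
# keeping only fields that have a column, in the order of the data columns.
trim_schema_to_data <- function(schema, data) {
  fields <- schema_field_names(schema)
  extra <- setdiff(names(data), fields)
  if (length(extra) > 0) {
    stop("Columns not described in schema: ", paste(extra, collapse = ", "))
  }
  schema$fields <- schema$fields[match(names(data), fields)]
  schema
}

# Stand-in schema (names only) and data without the `valid` column.
schema <- list(fields = lapply(
  c("tag_id", "sensor", "datetime", "value", "valid"),
  function(name) list(name = name)
))
measurements <- data.frame(
  tag_id = "18LY",
  sensor = "pressure",
  datetime = "2020-05-01",
  value = 12
)

padded <- pad_data_to_schema(measurements, schema)   # gains an NA `valid` column
trimmed <- trim_schema_to_data(schema, measurements) # loses the `valid` field
```

Either result could then be passed to `add_resource()` in place of the original data or schema. Note that trimming only touches `fields`: if the schema has a `primaryKey` or `foreignKeys` that refer to a dropped field, those would need adjusting as well.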

The feature has indeed been added as `fieldsMatch` in Data Package 2.0. Frictionless doesn't support 2.0 yet, but we aim to do so (including `fieldsMatch`). Fully supporting v2 is a daunting task though, so it won't be soon.