w3c-cg / dataspaces

Data discovery #2

Open pietercolpaert opened 1 month ago

pietercolpaert commented 1 month ago

Challenge Description

When you need to find interoperable, relevant and trustworthy datasets, this is a manual task today. Automating it requires a discovery mechanism, which is still an unsolved problem on the Web.

Example cases:

  1. Setting up a new route planner.
  2. Moving digital twin software from one city to another.
  3. Creating a dashboard of a certain indicator, adding more data when it becomes available.

Impact and Importance

Automating data discovery should reduce the costs for:

  1. setting up a new project
  2. bringing the project into another context
  3. maintaining the project over time

Desired Solution

  1. A language to express the criteria for a dataset to enter your project, based on: the shape or schema used (e.g., SHACL), the provenance (e.g., only datasets that originate from X or Y), geo-temporal extent, usage conditions, etc.
  2. A data model for a Web-based storage system or data catalog, so that the criteria can be evaluated against it.
  3. An algorithm to evaluate the criteria (1) over the data model (2).
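
To make the three parts above concrete, here is a minimal sketch in Python, not a proposed design. Everything in it is an assumption made for illustration: the class names `Criteria` and `DatasetDescription`, the field names (loosely inspired by the Dublin Core terms used in DCAT, such as `conformsTo`, `publisher`, `spatial`, `temporal` and `license`), and the `example.org` URLs. Shape conformance is reduced to a declared IRI here; a real solution would presumably validate against SHACL instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# (min_lon, min_lat, max_lon, max_lat); hypothetical encoding of a spatial extent
BBox = tuple[float, float, float, float]


@dataclass(frozen=True)
class DatasetDescription:
    """Part 2: a (hypothetical) catalog entry the criteria are evaluated over."""
    url: str
    conforms_to: frozenset[str]  # IRIs of shapes/schemas the dataset declares
    publisher: str               # provenance, simplified to a single IRI
    bbox: BBox
    start: date
    end: date
    license: str


@dataclass(frozen=True)
class Criteria:
    """Part 1: the (hypothetical) criteria; None means 'no constraint'."""
    required_shape: Optional[str] = None
    allowed_publishers: Optional[frozenset[str]] = None
    bbox: Optional[BBox] = None
    period: Optional[tuple[date, date]] = None
    allowed_licenses: Optional[frozenset[str]] = None


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True when the two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches(c: Criteria, d: DatasetDescription) -> bool:
    """Part 3: evaluate every stated criterion against one catalog entry."""
    if c.required_shape is not None and c.required_shape not in d.conforms_to:
        return False
    if c.allowed_publishers is not None and d.publisher not in c.allowed_publishers:
        return False
    if c.bbox is not None and not bboxes_intersect(c.bbox, d.bbox):
        return False
    if c.period is not None and not (c.period[0] <= d.end and d.start <= c.period[1]):
        return False
    if c.allowed_licenses is not None and d.license not in c.allowed_licenses:
        return False
    return True


def discover(c: Criteria, catalog: list[DatasetDescription]) -> list[str]:
    """Return the URLs of all catalog entries that satisfy the criteria."""
    return [d.url for d in catalog if matches(c, d)]


# Illustrative data only; none of these URLs refer to real datasets.
STOP_SHAPE = "https://example.org/shapes/StopShape"
CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"
catalog = [
    DatasetDescription("https://example.org/ghent/stops", frozenset({STOP_SHAPE}),
                       "https://example.org/org/ghent", (3.6, 51.0, 3.8, 51.1),
                       date(2024, 1, 1), date(2024, 12, 31), CC0),
    DatasetDescription("https://example.org/unknown/stops", frozenset({STOP_SHAPE}),
                       "https://example.org/org/unknown", (3.6, 51.0, 3.8, 51.1),
                       date(2024, 1, 1), date(2024, 12, 31), CC0),
]
criteria = Criteria(
    required_shape=STOP_SHAPE,
    allowed_publishers=frozenset({"https://example.org/org/ghent"}),
    bbox=(3.0, 50.5, 4.0, 51.5),
    period=(date(2024, 6, 1), date(2024, 6, 30)),
)
print(discover(criteria, catalog))
```

With this data, only the first entry passes, because the second one fails the provenance criterion. The sketch assumes a catalog that is already in memory; how such descriptions are found and crawled on the Web is exactly the open part of this challenge.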

Acceptance Criteria

  1. A specification of the language is available, with examples of how to express which datasets are relevant to your application
  2. A data model specification is available
  3. A reference implementation of the algorithm can be tested

References and Resources

TallTed commented 1 month ago

See also: Describing Linked Datasets with the VoID Vocabulary