MI-FraunhoferIWM / data2rdf

About: A generic pipeline that can be used to map raw data to RDF.
BSD 3-Clause "New" or "Revised" License

Data2RDF Next #28

Closed yoavnash closed 1 week ago

yoavnash commented 11 months ago

Goal: Allow users with basic python knowledge to generate RDF from their data.

Current bottlenecks:

Improvements:

  1. Take out EMMO from the code.
  2. HDF5 should not be pushed into DSMS as part of the pipeline.
  3. Use QUDT for representing the measurement units (instead of EMMO).
  4. Support contexts, i.e. the fourth element of a quad in the triplestore (the named graph a triple belongs to). See ConjunctiveGraph (example in the datasink app with the RDFLib-AllegroGraph connector) or a Graph with an identifier.
  5. Remove the need to provide the method graph.
  6. Support mapping format provided by form2rdf.
  7. Connection to vocabulary:
    1. Automatic suggestions
    2. Get vocabulary terms via the Excel file (might not work in LibreOffice or Office 365).
    3. Use DSMS: when a data file is uploaded to DSMS, its data terms are extracted and can be matched against the vocabulary. This generates a mapping file that data2rdf then uses.
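
To make items 3 and 4 concrete, here is a minimal sketch of what a QUDT-annotated value inside a context could look like when written out as N-Quads. It deliberately uses only the standard library so the "4th column" is visible in the output; it is not how data2rdf is implemented. The `EX` base IRI, the `quantity_quads` helper, and the example value are hypothetical. The QUDT terms used (`qudt:QuantityValue`, `qudt:numericValue`, `qudt:unit`, `unit:MegaPA`) are taken from the public QUDT vocabulary.

```python
# Namespaces of the public QUDT schema and unit vocabulary.
QUDT = "http://qudt.org/schema/qudt/"
UNIT = "http://qudt.org/vocab/unit/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# Hypothetical base IRI for the generated data; replace with a real one.
EX = "https://example.org/data/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Quads requires."""
    return f"<{value}>"


def quantity_quads(name, value, unit, context):
    """Return N-Quads lines describing one measured value.

    Each line has four terms: subject, predicate, object, and the
    context (named graph) that the triple belongs to.
    """
    subject = iri(EX + name)
    graph = iri(context)
    literal = f'"{value}"^^{iri(XSD_DOUBLE)}'
    return [
        f"{subject} {iri(RDF_TYPE)} {iri(QUDT + 'QuantityValue')} {graph} .",
        f"{subject} {iri(QUDT + 'numericValue')} {literal} {graph} .",
        f"{subject} {iri(QUDT + 'unit')} {iri(UNIT + unit)} {graph} .",
    ]


# One context per data file keeps the triples of different uploads apart.
quads = quantity_quads(
    "YieldStrength", 355.0, "MegaPA", EX + "graph/tensile-test-1"
)
print("\n".join(quads))
```

With RDFLib, the same separation would come from adding the triples to a graph that carries the context IRI as its identifier, as mentioned in item 4.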

To be considered:

  1. Support flat JSON dictionary as input format.
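
A rough sketch of what flat JSON input could mean in practice: every top-level key is a data term, every value is a scalar, and a mapping assigns vocabulary IRIs to the keys. The function names, the example keys, and the `https://example.org/vocab/` IRIs are hypothetical, and the mapping here is a plain dictionary rather than the form2rdf format.

```python
import json


def load_flat_json(text):
    """Parse JSON text and reject anything that is not a flat object."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    nested = [k for k, v in data.items() if isinstance(v, (dict, list))]
    if nested:
        raise ValueError(f"nested values are not supported: {nested}")
    return data


def apply_mapping(data, mapping):
    """Split the data into (IRI, value) pairs and keys without a mapping."""
    mapped, unmapped = [], []
    for key, value in data.items():
        if key in mapping:
            mapped.append((mapping[key], value))
        else:
            unmapped.append(key)
    return mapped, unmapped


# Hypothetical input and mapping, for illustration only.
raw = '{"specimen_id": "S-001", "width": 12.5, "operator": "jd"}'
mapping = {
    "specimen_id": "https://example.org/vocab/SpecimenID",
    "width": "https://example.org/vocab/Width",
}

mapped, unmapped = apply_mapping(load_flat_json(raw), mapping)
print(mapped)
print(unmapped)
```

Reporting the unmapped keys separately would give users a list of terms that still need a vocabulary match, which ties in with the suggestions in item 7 above.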

yoavnash commented 8 months ago

Possibly Related: Vocabulary management in DSMS drawio