OpenEnergyPlatform / oemetadata

Repository for the Open Energy Family metadata. Contains metadata templates, examples and schemas. For metadata conversion see https://github.com/OpenEnergyPlatform/omi
https://openenergyplatform.github.io/oemetadata/
MIT License

Working example: datapackage from datahub.io #51

Open jh-RLI opened 3 years ago

jh-RLI commented 3 years ago

I have been looking at other working examples of frictionless data package integrations. I reviewed the "gdp_zip.zip" data package from datahub.io, which also uses the frictionless data convention. This issue is just about documenting that example.

I think their datapackage.json has some fields that we are missing, such as "rowcount" or "bytes", which indicate the size of the data.

christian-rli commented 2 years ago

"bytes" would depend on how it's stored so I'd be wary of an implementation.
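To illustrate the storage dependence, here is a minimal sketch that serializes the same three rows once as CSV and once as JSON and compares the byte counts. The sample rows and the helper names `size_as_csv` and `size_as_json` are made up for this example; they are not part of OMI or the OEP.

```python
import csv
import io
import json

# Hypothetical sample rows; not taken from the gdp data package.
ROWS = [
    {"country": "DE", "year": 2015, "value": 1.5},
    {"country": "FR", "year": 2015, "value": 2.25},
    {"country": "IT", "year": 2015, "value": 3.0},
]


def size_as_csv(rows):
    """Return the number of bytes the rows occupy as UTF-8 encoded CSV."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(buffer.getvalue().encode("utf-8"))


def size_as_json(rows):
    """Return the number of bytes the rows occupy as UTF-8 encoded JSON."""
    return len(json.dumps(rows).encode("utf-8"))


csv_bytes = size_as_csv(ROWS)
json_bytes = size_as_json(ROWS)
print(f"rows: {len(ROWS)}, CSV bytes: {csv_bytes}, JSON bytes: {json_bytes}")
```

The row count is the same in both cases, while the byte count changes with the serialization, so a "bytes" value would only be meaningful together with a fixed storage format.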

"rowcount" could be useful, but it would have to be generated automatically on the platform. The same is true for "type" in a table's column description and the resource's "path".

Where is a good place to take care of this? Should OMI provide the necessary functions, with the OEP making use of them? Should it become part of the API, or should a script on the OEP itself take care of it?

jh-RLI commented 2 years ago

There are two use cases: the user provides the data, or the data is already stored in a database table on the OEP. In the second case the user does not need to provide this information, because data type, data size and row count can be generated automatically. If the data comes from an external source, the user would have to provide the information by filling in some new fields in the metadata.

How to implement, IMO: since OMI validates the metadata, we would extend it to validate the new fields. We would also need functionality to collect the information about the dataset automatically. This would be handled by the API (e.g. get the type of a column) plus a function that inserts the information into the metadata string (not sure whether that should be an OMI or an OEP function).
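A minimal sketch of the automatic case, under several assumptions: an in-memory SQLite database stands in for the OEP database, `describe_table` and `enrich_metadata` are hypothetical helpers (neither exists in OMI or the OEP API), and "rowcount" is the key name suggested earlier in this thread, not an established oemetadata key.

```python
import json
import sqlite3


def describe_table(connection, table):
    """Collect the row count and column types for one table.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    rowcount = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
    # These are the raw SQLite type names; mapping them to frictionless
    # field types would be a separate step that is not shown here.
    fields = [{"name": col[1], "type": col[2].lower()} for col in columns]
    return rowcount, fields


def enrich_metadata(metadata, rowcount, fields):
    """Return a copy of the metadata with the generated values filled in."""
    enriched = json.loads(json.dumps(metadata))  # simple deep copy
    resource = enriched["resources"][0]
    resource["rowcount"] = rowcount  # hypothetical key
    resource["schema"] = {"fields": fields}
    return enriched


# In-memory SQLite stands in for the OEP database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER, name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [(1, "a", 0.5), (2, "b", 1.5)],
)

metadata = {"name": "example", "resources": [{"name": "example"}]}
rowcount, fields = describe_table(conn, "example")
enriched = enrich_metadata(metadata, rowcount, fields)
print(json.dumps(enriched, indent=2))
```

Keeping the lookup and the insertion in separate functions leaves open the question of where each part lives: the lookup could sit behind the API while the insertion could belong to either OMI or the OEP.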

jh-RLI commented 2 years ago

I included an example datapackage in the oedatamodel repository. Parts of the oemetadata are included in the datapackage.json file.
It was not possible to use the full oemetadata string because frictionless datapackage cannot validate all fields. This problem may be solved if the schema.json is used for validation. I will test this as soon as I have time.

jh-RLI commented 2 years ago

@Ludee I think currently it is best to take the datapackage from the example above. The example still needs to be created for version 1.5.1. As it is now, however, the datapackage can be uploaded to the OEP using the oedatamodel_api. Perhaps adaptations must be made due to the new oem keys.

Ludee commented 2 years ago

The example table for the new version is already on the OEP: https://openenergy-platform.org/dataedit/view/model_draft/oep_metadata_table_example_v151

I added a new column to show the OEO feature. Metadata works on the example.

jh-RLI commented 2 years ago

Please see this datapackage example, which shows how the oedatamodel is used as a frictionless datapackage. The datapackage.json is different from the full oemetadata string.