OpenEnergyPlatform / oemetadata

Repository for the Open Energy Family metadata. Contains metadata templates, examples and schemas. For metadata conversion see https://github.com/OpenEnergyPlatform/omi
https://openenergyplatform.github.io/oemetadata/
MIT License

Revision of the OEMetadata structure to improve usage and technical correctness #97

Closed jh-RLI closed 1 month ago

jh-RLI commented 1 year ago

@wingechr We talked about this in the project meeting last week. Could you please add a brief sketch of what you mean so we can discuss it and make it available to others?

I think if you could quickly change the structure of the OEMetadata example to what you explained to me, that would help me and others understand it. I thought I understood, but I forgot which key you were talking about :) I think it was about resources or schema?

--> https://github.com/OpenEnergyPlatform/oemetadata/blob/develop/metadata/latest/example.json

wingechr commented 1 year ago

OLD

PROPOSED NEW

{
    "name": "<BUNDLE_NAME>",
    "resources": [
        {
            "name": "<TABLE_NAME_1>",
            "path": "<URL>",
            "schema": {...},
            <METADATA>    
        },
        {
            "name": "<TABLE_NAME_2>",
            "path": "<URL>",
            "schema": {...},
            <METADATA>    
        },
        ...
    ]
}
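
The proposed shape can be checked mechanically. The following is only a minimal sketch: it assumes that `name`, `path` and `schema` are required on every resource (the proposal shows them but does not say so explicitly), and `check_bundle` is a hypothetical helper, not part of oemetadata or omi.

```python
# Keys shown on every resource in the proposal; treating them as
# required is an assumption made for this sketch.
REQUIRED_RESOURCE_KEYS = ("name", "path", "schema")


def check_bundle(bundle):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    if not isinstance(bundle.get("name"), str):
        problems.append("bundle: missing 'name'")
    resources = bundle.get("resources")
    if not isinstance(resources, list):
        problems.append("bundle: 'resources' must be a list")
        return problems
    for index, resource in enumerate(resources):
        for key in REQUIRED_RESOURCE_KEYS:
            if key not in resource:
                problems.append(f"resources[{index}]: missing '{key}'")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything that is wrong with a bundle in one pass.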

henhuy commented 1 year ago

Agree with your proposal!

wingechr commented 1 year ago

sorry, one more correction: in the bundled datapackage.json, which presumably is part of a zipped datapackage in a download, the resource path must be changed to the now-local csv file that contains the table data. There can also be some additional metadata values (like the csv encoding) that don't make sense for data in the database, but do for its physical representation in a file when downloaded:

{
    "name": "<BUNDLE_NAME>",
    "resources": [
        {
            "name": "<TABLE_NAME_1>",
            "path": "data/<PATH/TO/CSV.csv>",
            "encoding": "utf-8",
            "schema": {...},
            <METADATA>    
        },
        {
            "name": "<TABLE_NAME_2>",
            "path": "data/<PATH/TO/CSV.csv>",
            "encoding": "utf-8",
            "schema": {...},
            <METADATA>    
        },
        ...
    ]
}
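
The path rewrite described above could look roughly like this when a download is assembled. This is a sketch under assumptions: the file layout `data/<resource name>.csv` and the helper name `to_download_package` are made up for illustration and are not taken from the OEP code.

```python
import copy


def to_download_package(bundle, csv_dir="data", encoding="utf-8"):
    """Return a copy of `bundle` whose resources point at local CSV files.

    Assumption: each table is exported to `<csv_dir>/<resource name>.csv`.
    """
    package = copy.deepcopy(bundle)  # leave the platform metadata untouched
    for resource in package["resources"]:
        resource["path"] = f"{csv_dir}/{resource['name']}.csv"
        # Only meaningful for the physical file, not for the database table.
        resource["encoding"] = encoding
    return package
```

Working on a deep copy keeps the platform-side metadata unchanged, so the same bundle can still be served with its original URLs.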

jh-RLI commented 1 year ago

I added an example presenting what my understanding of the changes is.

One more thing, we have to figure out where to put the keys that have been stored in the old resources.

"profile": "tabular-data-resource",
"name": "model_draft.oep_metadata_table_example_v152",
"path": "http://openenergyplatform.org/dataedit/view/model_draft/oep_metadata_table_example_v152",
"format": "PostgreSQL",
"encoding": "UTF-8", 

Other tasks after this one is completed.

wingechr commented 1 year ago

just to be clear: the metadata is different when it is on the platform and when it is in a downloaded zip package.

most notably:

as an example: i could upload a csv file with encoding iso-8859-1, but in the postgres database it will always be utf-8, so the encoding information becomes irrelevant. if i download it again though, the encoding of the csv file (once created) becomes relevant for the user again.
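
To make the encoding point concrete, here is a small sketch using only the Python standard library. The sample text and the helper name `reencode_csv_bytes` are illustrative assumptions; the sketch does not describe how the OEP actually handles uploads.

```python
def reencode_csv_bytes(raw, source_encoding, target_encoding="utf-8"):
    """Decode bytes with the declared source encoding, then re-encode them."""
    return raw.decode(source_encoding).encode(target_encoding)


# Hypothetical one-cell CSV payload containing a non-ASCII character.
uploaded = "Überlandleitung\n".encode("iso-8859-1")

# What a utf-8 export of the same text would contain.
exported = reencode_csv_bytes(uploaded, "iso-8859-1")
```

The two byte strings differ even though the text is the same, which is why a reader of the downloaded file needs the `encoding` value while a reader of the database table does not.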

jh-RLI commented 1 year ago

True. Should we add functionality to the OEP metadata upload and download that removes this information on upload and adds it back on download? Then the main change to the current oemetadata spec would just be the removed resources key.
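
For the upload direction, removing file-only information might be sketched as follows. Which keys count as file-only is an assumption here: only `encoding` is named in this thread, so it is the only default. `strip_file_only_keys` is a hypothetical helper operating on the proposed `resources` list.

```python
import copy

# Assumption: only "encoding" is known from the discussion to be file-specific.
FILE_ONLY_KEYS = ("encoding",)


def strip_file_only_keys(package, keys=FILE_ONLY_KEYS):
    """Return a copy of `package` without keys that only describe a file."""
    platform_metadata = copy.deepcopy(package)
    for resource in platform_metadata.get("resources", []):
        for key in keys:
            resource.pop(key, None)  # ignore resources that lack the key
    return platform_metadata
```

Passing the key list as a parameter keeps the decision about what is file-specific out of the function itself, since that list is still under discussion.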

wingechr commented 1 year ago

yea. i think like this: