OpenEnergyPlatform / oeplatform

Repository for the code of the Open Energy Platform (OEP) website. The OEP provides an interface to the Open Energy Family
http://openenergyplatform.org/
GNU Affero General Public License v3.0

Improving the implementation of OEM with regard to data models with multiple tables / improve datapackage support #735

Open jh-RLI opened 3 years ago

jh-RLI commented 3 years ago

In the course of the development of the oedatamodel, and also e.g. in the current review of the oemof_b3 dataset, it has become apparent that there is no good solution (as far as I know) for attaching the same metadata to several related tables. Among other things, this makes C(R)UD operations hard to use from the user's point of view: each table must be handled individually in order to distribute a change. To illustrate: a user uploads a dataset that is spread over several tables (e.g. 15) and attaches metadata to each of them. Later, a new source needs to be documented in the metadata. The user knows the OEP and uses the web interface to edit the metadata, but unfortunately has to update the metadata for each of the tables individually :(
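To make the pain point concrete, here is a minimal sketch of what such an update currently amounts to, assuming the OEP HTTP API exposes each table's metadata under `/api/v0/schema/<schema>/tables/<table>/meta/` (the endpoint and payload shape are assumptions based on the OEP REST API pattern; the schema, table names, and token are placeholders):

```python
import requests

OEP_API = "https://openenergyplatform.org/api/v0"
SCHEMA = "model_draft"  # assumed schema
TABLES = [f"oemof_b3_table_{i}" for i in range(1, 16)]  # 15 related tables
TOKEN = "your-oep-token"  # placeholder
HEADERS = {"Authorization": f"Token {TOKEN}"}

new_source = {"title": "New source", "path": "https://example.org/source"}

for table in TABLES:
    meta_url = f"{OEP_API}/schema/{SCHEMA}/tables/{table}/meta/"
    meta = requests.get(meta_url, headers=HEADERS).json()
    # Append the new source to this table's copy of the metadata.
    meta.setdefault("sources", []).append(new_source)
    # Every table has to be written back individually -- this is the
    # usability problem described above.
    requests.post(meta_url, json=meta, headers=HEADERS)
```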

From a developer's point of view this may not be a problem, as it is easy enough to replace the metadata "quick and dirty" on all affected tables. My main concern here is usability.

I have created an overview image for the use case (a possible solution, IMO). I think more work will be needed to implement and display data models that span multiple tables on the OEP website.

[Image: CRUD_md_on_multiple_tables_same_dataset]

Ludee commented 3 years ago

This means metadata would no longer be stored as a comment on the table but moved to a central "metadata table". That would be a major rebuild of the metadata structure and would cause many changes in different modules!?

jh-RLI commented 3 years ago

That could be the case, but it is also possible to keep the metadata as a comment on each table and update it when changes occur. The central metadata table would then serve only as the resource in which changes are detected and from which they are distributed (as comments on the tables).
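A minimal sketch of that distribution step, assuming a hypothetical central table `_metadata_store(dataset, tables, metadata)` and psycopg2 (all names and connection details are made up for illustration):

```python
import json
import psycopg2
from psycopg2 import sql

conn = psycopg2.connect("dbname=oep")  # placeholder connection

def distribute_metadata(dataset: str) -> None:
    """Copy the central metadata document onto every member table
    as a plain PostgreSQL COMMENT, keeping the existing storage format."""
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT tables, metadata FROM _metadata_store WHERE dataset = %s",
            (dataset,),
        )
        tables, metadata = cur.fetchone()
        comment = json.dumps(metadata)
        for table in tables:
            # COMMENT ON accepts no bind parameters, so quote safely
            # via psycopg2.sql instead of string formatting.
            cur.execute(
                sql.SQL("COMMENT ON TABLE {} IS {}").format(
                    sql.Identifier(table), sql.Literal(comment)
                )
            )
```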

christian-rli commented 3 years ago

Is this issue still open? Reading through it I started wondering about something that might only be partially related. Concerning the simultaneous update of several tables, I think one main hindrance is that we don't allow partial updates. If there were such a functionality, it would be easy to implement an online option for this as well. Right now one always needs to account for the full resource definition, which I believe tends to mess up the workflow. The easiest solutions I can think of are:

  1. Update the API to allow partial updates to the metadata
  2. Create a function in OMI to allow for partial updates

Metadata is stored as a comment on the table, so a partial update could work along the following steps (see the sketch below):

  1. download the existing metadata string
  2. use a function that detects the differences between the existing string and the new string/JSON containing the partial update
  3. have the same or another function produce an updated string that combines the information
  4. upload the result to the same table

The function(s) could be part of OMI or of the API (maybe both?). I'm not sure which would be best, but I would hope that this won't result in yet another tool :)
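A sketch of what steps 2 and 3 could look like as a recursive JSON merge, regardless of whether it ends up in OMI or in the API; the download (step 1) and upload (step 4) are left out here:

```python
import json

def merge_partial(existing: dict, partial: dict) -> dict:
    """Recursively apply a partial metadata update: nested dicts are
    merged key by key; all other values are replaced by the new one."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = value
    return merged

# Steps 2-3 in action on a toy metadata document:
existing = json.loads('{"title": "oemof_b3", "sources": [{"title": "old"}]}')
partial = {"context": {"contact": "user@example.org"}}  # the partial update
updated = merge_partial(existing, partial)              # combined document
print(json.dumps(updated, indent=2))
```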

jh-RLI commented 2 years ago

In #1059, a new method for saving and reading oemetadata to and from the database is introduced, which solves part of this issue. The next step is the implementation of a data collection store consisting of several tables (which are technically not connected but belong to the same area/project). In addition, tables created as relational data models (tables that are technically related) should also be stored in the collection. Both kinds of collections should be visualized on the OEP. If we want to follow the idea presented here, we would also have to implement some functions to make data collections smoothly compatible with Frictionless data packages. It is therefore necessary to generate a datapackage.json file from the metadata that describes all resources of a data package.
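A sketch of that last step, generating a minimal datapackage.json that lists every table of a collection as a resource; `get_table_metadata` is a hypothetical helper standing in for however the per-table oemetadata is read, and the resource paths are assumptions:

```python
import json

def get_table_metadata(table: str) -> dict:
    """Placeholder: in the real implementation this would read the
    oemetadata stored for `table` (e.g. from the comment on the table)."""
    return {"title": table, "description": ""}

def build_datapackage(name: str, tables: list) -> dict:
    """Assemble a minimal Frictionless datapackage.json dict that lists
    every table of the collection as one resource."""
    resources = []
    for table in tables:
        meta = get_table_metadata(table)
        resources.append({
            "name": table,
            "path": f"data/{table}.csv",  # assumed export layout
            "title": meta.get("title", table),
            "description": meta.get("description", ""),
        })
    return {"name": name, "resources": resources}

with open("datapackage.json", "w") as f:
    json.dump(build_datapackage("oemof_b3", ["table_a", "table_b"]), f, indent=2)
```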