Open areleu opened 7 months ago
Hi @areleu,
In general, I don't see much reason not to use the date, but in practice this is more metadata, which is why it should be stored in the oemetadata. If there is no other way, then use only year-month, otherwise I would strongly recommend a version number.
In the metadata, you can use the title field to change the table name that is visible in the OEP. In most cases, it is better to provide this information there. It will be displayed in this format: --> readable table name from title (technical table name)
I'm not sure if your comment about multiple tables has anything to do with it, but: it is also possible to upload multiple tables (including tables containing relationships), but this is more complicated and they are not visibly grouped. So far the tags should then be used so that these tables can be found together.
Otherwise, the metadata looks good. One thing I would have to double check:
You have always deleted the "valueReference": []
in the resources if you have not specified one there. I'm not sure if this causes an error when displaying the metadata for these fields.
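If the missing key does turn out to be a problem for the metadata viewer, one way to restore it before uploading could look like the sketch below. It assumes the fields live under `resources -> schema -> fields`; the helper name `ensure_value_reference` and the example content are hypothetical and only serve as illustration.

```python
import copy


def ensure_value_reference(metadata):
    """Return a copy of the metadata where every field has a valueReference list.

    Assumption: fields are nested as resources -> schema -> fields.
    """
    fixed = copy.deepcopy(metadata)
    for resource in fixed.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            # Only add the key when it is missing; existing entries are kept.
            field.setdefault("valueReference", [])
    return fixed


# Hypothetical, heavily shortened metadata used only to illustrate the helper.
example = {
    "resources": [
        {
            "name": "model_draft.example_table",
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer"},
                    {
                        "name": "operator",
                        "type": "text",
                        # Placeholder entry, not a real annotation.
                        "valueReference": [{"value": "placeholder"}],
                    },
                ]
            },
        }
    ]
}

fixed = ensure_value_reference(example)
```

The input is copied rather than modified in place, so the original metadata file content stays untouched until the result has been checked.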
In general, I don't see much reason not to use the date, but in practice this is more metadata, which is why it should be stored in the oemetadata. If there is no other way, then use only year-month, otherwise I would strongly recommend a version number.
The date is in the publicationDate field as well. I would use a version number, but since this is a redistribution, I think it is important to stick to their "versioning" pattern. Otherwise it is just confusing. Unfortunately the primary source is not versioned: once a new version comes out, the previous version is deleted. And I don't really know how many versions were out there. Do I assign this version 1? But what would it mean? If someone tries to track this number to the original website, they will be looking for the wrong thing. When I put the date, at least they know this is based on the "Ladesäuleregister released on ZZ.XX.YYYY" file, which they will probably not find because the files are not archived.
I guess a compromise would be to name the version as the date itself, but aren't we then missing a version field in the header? We could add that and also add a versioning feature to the OEP. That way I could give the table a generic name and different versions could be associated with the same table, or, if that is not done in the OEP itself, we could also think about doing some kind of Zenodo integration.
BTW, is there any plan to give DOIs to the metadata? I think this would make the OEP more or less "FAIR complete". If the OEP does not do this, we could then really think about this Zenodo integration.
I'm not sure if your comment about multiple tables has anything to do with it, but: it is also possible to upload multiple tables (including tables containing relationships), but this is more complicated and they are not visibly grouped. So far the tags should then be used so that these tables can be found together.
I mean a relational model, each table on their own is not very useful, the frictionless spec allows a data package to have multiple resources associated to each other with foreign keys. Frictionless also allows building a relational database from these multi-resource metadata descriptors. Unfortunately the OEP is not yet adapted to fully exploit this feature, but I think it is very necessary, since you rarely find datasets consisting of single tables, and if you do, they are not well structured (see this dataset as an example: it has the same fields 4 times, distinguished only by numbers).
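To make the multi-resource idea concrete, here is a minimal sketch of a frictionless-style data package descriptor with two resources linked by a foreign key, plus a small check of the references. The resource names, field names and rows are made up for illustration, and the hypothetical helper `find_broken_references` only handles single-column keys; it is not how the OEP or the frictionless library does it.

```python
# Descriptor following the Data Package / Table Schema layout:
# "foreignKeys" entries point from one resource to a field in another.
package = {
    "name": "charging-stations-normalised",
    "resources": [
        {
            "name": "operators",
            "data": [
                {"id": 1, "name": "Operator A"},
                {"id": 2, "name": "Operator B"},
            ],
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer"},
                    {"name": "name", "type": "string"},
                ],
                "primaryKey": "id",
            },
        },
        {
            "name": "stations",
            "data": [
                {"id": 10, "operator_id": 1},
                {"id": 11, "operator_id": 2},
            ],
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer"},
                    {"name": "operator_id", "type": "integer"},
                ],
                "primaryKey": "id",
                "foreignKeys": [
                    {
                        "fields": "operator_id",
                        "reference": {"resource": "operators", "fields": "id"},
                    }
                ],
            },
        },
    ],
}


def find_broken_references(pkg):
    """Return (resource name, value) pairs whose foreign key has no target row."""
    resources = {res["name"]: res for res in pkg["resources"]}
    broken = []
    for res in pkg["resources"]:
        for fk in res["schema"].get("foreignKeys", []):
            target = resources[fk["reference"]["resource"]]
            known = {row[fk["reference"]["fields"]] for row in target["data"]}
            for row in res["data"]:
                if row[fk["fields"]] not in known:
                    broken.append((res["name"], row[fk["fields"]]))
    return broken
```

With a descriptor like this, the repeated numbered columns of the flat table would become rows in a child table that reference the parent by key.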
Otherwise, the metadata looks good. One thing I would have to double check: You have always deleted the
"valueReference": []
in the resources if you have not specified one there. I'm not sure if this causes an error when displaying the metadata for these fields.
So the metadata is already in the model_draft section, and it seems that it is raising some error but is not completely breaking the page
I understand what you are pointing out, and in this case I think it makes sense to include the date until there is a better solution. It's still a compromise, but you're right that it should be obvious which source this data table refers to. Still, it doesn't seem that important since the source is also included in the metadata. But since I like pragmatic solutions, I won't be restrictive here.
I guess a compromise would be to name the version as the date itself, but aren't we then missing a version field in the header? We could add that and also add a versioning feature to the OEP.
This feature has been planned for a long time and was available in some form a few years ago, but was then dropped because the developer left the team. The versioning is running in the background (at least that's what I was told, I never had the time to check if this is actually the case). The problem is the same as always: this feature is currently not part of the research projects we are working on. We are currently working on including such features in upcoming projects.
BTW, is there any plan to give DOIs to the metadata? I think this would make the OEP more or less "FAIR complete". If the OEP does not do this, we could then really think about this Zenodo integration.
Yes, this is also planned, and I remember that we agreed on Zenodo integration. What is holding this back is the same problem I described above: it is currently not part of the research projects.
I mean a relational model, each table on their own is not very useful, the frictionless spec allows a data package to have multiple resources associated to each other with foreign keys.
It is true that it is currently not possible to provide the metadata for each resource (table) (in a single oemetadata json string). This is something we are also working on (see this issue). In general, I also agree that the oep does not fully support multi-table models, but my point is that in general the Resources field in the oemetadata can already be used to specify and create (by using oem2orm software) a relational model (see this example that specifies the oedatamodel).
So the metadata is already in the model_draft section, and it seems that it is raising some error but is not completely breaking the page
Ah, thanks for checking, I was expecting that :) But as you can guess, the metadata viewer widget will also be reworked so that it can handle missing fields. Hopefully it won't take too long: this is also not part of a research project, but it shouldn't be too much effort and will have to happen in the course of the future oemetadata updates anyway.
Regarding the review: With this metadata you can get the "Platinum" badge. We lack proper documentation about the badges. In short: you can get 'platinum' with this metadata because you have also provided annotations.
Which topic should this data be moved to? https://openenergy-platform.org/dataedit/schemas
As we are still in the transition phase to move this review process to the oep platform, it would make sense to carry out this review again on the oeplatform as soon as the last bugs in the OpenPeerReview functions have been fixed.
Which topic should this data be moved to? openenergy-platform.org/dataedit/schemas
Tricky, maybe economy?
economy: Data related to economic activities. Examples: sectoral value added, sectoral inputs and outputs, GDP, prices of commodities etc.
It is infrastructure data, but not energy infrastructure data per se, so it does not belong to grid.
grid: Energy transmission infrastructure. examples: power lines, substation, pipelines.
But if I put on my energy modelling shoes, I would say demand, but it is not quite demand data either.
demand: Data on demand. Demand can relate to commodities but also to services.
@fabmio any suggestions?
edit: It is infrastructure data, so if we expand the scope of grid to infrastructure, this could be easily decided.
I agree that grid includes infrastructure. I will move the table and let you know. 👍
We are currently implementing a publishing process for the OEP as we want to avoid the review on GitHub. We will try to migrate the currently published data as well as possible, but it would be great to have some users to test the process and give feedback. It will take some time to implement, but there will be a slimmed-down version of the feature available soon. You will be able to use it via the profile page in the OEP. Are you okay with waiting for this?
Sure I can help testing it.
Hi @areleu, now on the profile pages of the oep website, you can use the publish button :)
You may have already noticed the changes. Here's a quick summary of how to access them:
On https://openenergyplatform.org/ navigate to the profile page and click the tables section:
Browse your tables. There are two sections, Published and Draft. This view is paginated, so you may need to search for your table. I realise that a search function would be helpful, and it will be implemented.
Once you have found the table, check whether the licence check has been completed. If so, you should be able to click the "Publish" button and select the "Infrastructure" theme.
If the licence check has failed, you must ensure that your licence name (from the "oemetadata licences" field) matches the SPDX licence list.
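For anyone who wants to test the licence name locally before clicking "Publish", a rough sketch could look like this. It is not the check the OEP runs: the metadata key `licenses` with `name` entries, the exact-match comparison, and the small hard-coded subset of SPDX identifiers are all assumptions made for illustration, and the helper name `unmatched_licence_names` is hypothetical.

```python
# Small illustrative subset of SPDX identifiers; the full list is much longer.
KNOWN_SPDX_IDS = {
    "CC0-1.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "ODbL-1.0",
    "PDDL-1.0",
    "MIT",
}


def unmatched_licence_names(metadata, known_ids=KNOWN_SPDX_IDS):
    """Return the licence names that do not exactly match a known SPDX identifier.

    Assumption: licences are stored as a list under "licenses",
    each entry carrying its identifier under "name".
    """
    names = [licence.get("name") for licence in metadata.get("licenses", [])]
    return [name for name in names if name not in known_ids]


# Hypothetical metadata snippets.
good = {"licenses": [{"name": "CC-BY-4.0"}]}
bad = {"licenses": [{"name": "Creative Commons Attribution 4.0"}]}
```

A non-empty result means at least one licence name would need to be replaced by its SPDX identifier.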
Issue description
This is a cleaned up and annotated version of the Ladesäuleregister of the BNetzA
The code to do this is made available here: https://github.com/areleu/fair-charging-station-data
A poster associated with this cleanup was presented at the RDA 21st Plenary during IDW2023
I am publishing last month's release; this data is updated about three times a year, according to what I have noticed so far. I know that dates are discouraged in the OEP naming conventions. But I still think that the most proper way of redistributing this data is by associating it with the date of publication. @chrwm suggested that I add a column for the publication date, but I think this would cause 2 problems:
There is a version of the dataset which is also normalised, which saves a lot of space and is significantly more manageable, but it requires multiple tables, and I don't know if this feature is already available.
Workflow checklist
GitHub
OEP
Start a Review
Reviewer section
Metadata and data for review
Here are the links to my data and metadata. Naming follows the pattern model_draft.project_nameofdata:
Metadata: https://github.com/OpenEnergyPlatform/data-preprocessing/blob/review/bnetza_charging_stations_01_07_2023%23112/data-review/bnetza_charging_stations_01_07_2023.json
Data: https://openenergy-platform.org/dataedit/view/model_draft/bnetza_charging_stations_01_07_2023
Reviewed and published metadata and data
Final naming and location of the data and metadata after the review are as follows: schema.tablename