Closed stephenholleran closed 8 months ago
@sdsmdp @stephenholleran
I see the following version information in the JSON example currently. Is this sufficient ? If I understand correctly, X wants to pass their data to Y, and hence generates the JSON and specifies that the JSON is consistent with V0.1 of the WRA schema ? In that case , do we need anything beyond, simply : version: '0.1.0'
?
How do we handle future breaking changes seems to be a separate but important issue as well.
"version": {
"type": "string",
"title": "Version",
"description": "The version of the IEA WRA Data Model this file is based on. The format of this follows Semantic Versioning with the year and month tagged on after. E.g. 1.1.1-2020.07",
"examples": [
"0.1.0-2020.06",
"1.1.1-2020.07"
],
"pattern": "^([0-9]{1,2})[.]([0-9]{1,2})[.]([0-9]{1,2})-([0-9]{4})[.]([0-9]{2})$"
}
@abohara , I am confused by your comment. The snippet you quoted is from the JSON schema, and so it specifies and describes the version attribute. In practice, an actual JSON file that implements the schema will have simple name-value pair as you suggest (e.g., version: "0.1.0-2020.06"
).
Is it the additional identifiers that follow the version number that you question? I believe (but we could test) that any validation tool and version comparators know how to handle (or ignore) such metadata. It will only look at <major>.<minor>.<patch>
. But perhaps that is not your concern...?
Regarding breaking changes, I agree it is important. I don't believe solving it is urgent until we propose a change that would be a breaking change. To put it another way, whatever method we come up with may be inadequate until we are in a situation where we are proposing an actual breaking change.
I suggest a set of principles we could follow regarding breaking changes. For example:
<major>
number).Hi @abohara @sdsmdp To try and help clear up @sdsmdp confusion, the code snippet:
"version": {
"type": "string",
"title": "Version",
"description": "The version of the IEA WRA Data Model this file is based on. The format of this follows Semantic Versioning with the year and month tagged on after. E.g. 1.1.1-2020.07",
"examples": [
"0.1.0-2020.06",
"1.1.1-2020.07"
],
"pattern": "^([0-9]{1,2})[.]([0-9]{1,2})[.]([0-9]{1,2})-([0-9]{4})[.]([0-9]{2})$"
}
Is in the Schema file to describe how the Schema version number is displayed in a json implementation, as in the demo implementation:
{
"author": "Stephen Holleran",
"organisation": "brightwind",
"date": "2020-06-07",
"version": "0.0.0-2020.07",
"plant_name": "A Name of Wind Farm",
"plant_type": "onshore_wind",
So to answer @abohara's question:
If I understand correctly, X wants to pass their data to Y, and hence generates the JSON and specifies that the JSON is consistent with V0.1 of the WRA schema? Is this sufficient?
Yes, I think it is sufficient for Y to know what Schema version was followed to create the json and so will be able to parse it based on that same version.
The problem is how can person Y get that particular version of the Schema? I thought of this earlier last week. I think I should have put in a "$version": "v0.1.0-2021.01
at the top of the Schema. This would be the easy thing and I should have done it.
One of the reasons I didn't consider it is because I thought once the repo is tagged and released with 'v0.1.0-2021.01' we would be able to link to any file in the repo by including the release name. This is possible through git but what I think comes back is the whole repo as a .zip as you can manually download from here: https://github.com/IEA-Task-43/digital_wra_data_standard/releases/tag/v0.1.0-2021.01 This could be unzipped in code and then have the Schema file for this version. A bit of a long way around but is possible.
Another method is that every version of a file on GitHub has a perma link. A permanent link that shows that file at that commit so everyone with that link can see the same thing. Person Y would have to have a linking table between version number and the perma link. Again, possible, but not great. I think including "$version": "v0.1.0-2021.01
in the actual Schema file to identify itself is best?
@sdsmdp
Regarding the rest of your post on principals for updating the Schema and version number I totally agree it would be good to implement. I found months ago a blog about updating JSON Schema's specifically and it looks reasonable. https://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/
@stephenholleran @sdsmdp
Sorry for the confusion, but it seems like it is cleared up. I think I mistook the originally issue to mean that the JSON schema did not have any version information. Hence I had that snippet there as evidence that maybe we could clear it.
The first two principals ( from @sdsmdp) seem very reasonable and aligned with the Sem. Ver. approach. I have also found an accompanying 'changelog' documenting 'breaking' changes to also be helpful . I have seen them attached with the tagged releases ( which is how I would trace the versioning information if I was starting out fresh).
Do you both feel we need any thing more detailed at the moment ? If not, how do you suggest documenting this to move ahead ?
Other than implementing a version at the top of the schema as @stephenholleran suggested, I don't think we need to do anything else at this time. When we actually make a change we can think through and better provide guidance on how to:
In other words, I think figuring this out can be one of the stories to complete when we make a change (esp. if it is a breaking change). We do not need to burden the initial release with figuring it out now.
I have create a new issue #47 to specifically deal with adding the version number to the schema.
This issue is about putting versions into the database so this issue will stay open.
@kersting Do you know if there is a way to convert a JSON Schema file into a PostgresSQL or other database schema.
@stephenholleran Unfortunately we couldn't find any way to convert it.
Hey @kersting, we still need to do something with regards the versioning of the PostgresSQL. At the moment whenever there is a change that effects the PostgresSQL I will keep it updated but we should formalize this with a version number.
No clear actions to take. The SQL statements are more of a nice to have. Any user using them will most likely tweak their implementation of the database and so we feel there is no real reason to maintain a version history of this. We will endeavor to keep it up to date with the current version of the json schema.
Falling out from the discussion in issue #29 about tagging the version of the schema, specifically @sdsmdp's comment, we should track the Data Model version in a new database table to help with updating existing databases to newer versions.
Resulting from this would be making sure that updates to the Data Model are matched in the create table SQL statements and writing scripts within that to update previous databases to the new schema.