Open ltalirz opened 2 years ago
In principle, all metadata can evolve over time - we've already run into this with time-dependent query strings, see https://github.com/ltalirz/atomistic-software/issues/114#issuecomment-1097488391
At the same time, recording time-dependent metadata will remain an exception and be introduced only where necessary (most metadata won't change over time).
This could be modeled by a schema like:
"code-name": {
    "query_string": "val",
    "license": "val",
    "updates": {
        "2020": {
            "license": "val-new",
            "query_string": "val-new"
        }
    }
}
where the value for 2021 would be obtained by recursively updating the top-level dictionary of the code with the changes from all relevant years (here: all years up to and including 2021) under the "updates" key.
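A minimal sketch of how this recursive update could work, assuming the top-level values are the earliest known ones and each year under "updates" applies from that year onwards. The helper names `deep_update` and `render_for_year` are hypothetical and not part of the current code base.

```python
import copy


def deep_update(base, changes):
    """Recursively merge `changes` into `base` (in place) and return `base`."""
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)
    return base


def render_for_year(entry, year):
    """Return the metadata of one code as it applies to `year`.

    Assumption: updates are applied in chronological order, and an update
    keyed by year Y is used for all years >= Y.
    """
    # Start from the top-level values, leaving out the "updates" key itself.
    rendered = copy.deepcopy({k: v for k, v in entry.items() if k != "updates"})
    updates = entry.get("updates", {})
    for update_year in sorted(updates, key=int):
        if int(update_year) <= year:
            deep_update(rendered, updates[update_year])
    return rendered


# Placeholder entry mirroring the schema above.
entry = {
    "query_string": "val",
    "license": "val",
    "updates": {
        "2020": {"license": "val-new", "query_string": "val-new"},
    },
}

metadata_2019 = render_for_year(entry, 2019)
metadata_2021 = render_for_year(entry, 2021)
```

With this reading, `metadata_2019` still holds the top-level values, while `metadata_2021` picks up the 2020 changes. The original `entry` is not modified, so the rendered versions for all years can be pre-built from the same source.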
This probably means we should then pre-build the "rendered" version of the code metadata for each year. This is not a big deal though - the current file has 40KB of text, so with 12 years we'll be adding less than 12 x 40KB = 480KB of memory.
Since the citation data for year X is retrospective, the changes for year X should reflect the metadata at the beginning of year X.
There have been a number of license changes since the list started in 2021: at least molcas, castep, amber, gromos, cpmd. I suspect a few more might have changed between 2010 and 2021.
Edit: DIRAC switched to LGPL
Since we only record the latest license, we currently misrepresent the license trends at the ecosystem level (in particular, free/open licenses are actually growing more strongly than the current graphs would show).
It would be very nice to fix this.
For an accurate historical analysis of license models, one should record the license of the individual codes as a function of time, rather than just stating the license that a code has today.
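As a rough sketch of what recording the license as a function of time would enable, the snippet below tallies licenses per year from entries that follow the "updates" schema proposed above. The code names and license labels are made-up placeholders, not real data, and the helpers `license_in_year` and `license_counts` are hypothetical.

```python
from collections import Counter


def license_in_year(entry, year):
    """Return the license of one code as of `year`.

    Assumption: the top-level "license" is the earliest known value, and an
    update keyed by year Y applies to all years >= Y.
    """
    current = entry["license"]
    updates = entry.get("updates", {})
    for update_year in sorted(updates, key=int):
        if int(update_year) <= year:
            # Updates that do not touch the license leave it unchanged.
            current = updates[update_year].get("license", current)
    return current


def license_counts(codes, year):
    """Count how many codes carry each license in `year`."""
    return Counter(license_in_year(entry, year) for entry in codes.values())


# Placeholder data for illustration only.
codes = {
    "code-a": {"license": "proprietary", "updates": {"2015": {"license": "open"}}},
    "code-b": {"license": "open"},
}

counts_2012 = license_counts(codes, 2012)
counts_2020 = license_counts(codes, 2020)
```

If only the latest license were stored, both years would show two open codes. With the time-dependent record, 2012 shows one proprietary and one open code, which is the kind of difference the ecosystem-level graphs currently miss.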