Closed Stephen-Gates closed 7 years ago
Will source new licences from https://github.com/okfn/opendefinition/tree/gh-pages/licenses. Not sure if that source has values for all columns in the CSV. Does scrape.py do this?
Entirely possible that opendefinition.org/licenses/ has changed so that scrape.py won't work.
FWIW the approach I was going to take, or investigate whether it'd work, would be to make a Jekyll collection of licenses on opendefinition.org so that it would be machine-readable at the source. See https://github.com/github/choosealicense.com/tree/gh-pages/_licenses for an example of that concept, even with licenses as the topic. :smile:
As discussed I can do some basic things (like fix a CSV) but not code (unless we're talking COBOL and Mainframes 25 years ago). So if the scraper is broken, I'm eyeballing the changes so I'll close this PR.
I was trying to progress this to support another project so it could draw from a list of authoritative open licences to make Frictionless Data using Data Packages that require an open licence according to the specification.
If this list of open licences is not going to be maintained, then I'll create a (smaller) drop down list in my App of "preferred" open licences. I'll avoid external dependencies that way.
@Stephen-Gates I don't think we need to scrape in the first instance - we can just maintain the CSV list here. (Even if there is some duplication with the opendefinition list atm - ultimately we obviously want one place that is authoritative.)
@mlinksva amazing work on the opendefinition licenses directory.
Summary: I think we want this repo (for now) as a simple data package that people can reuse in other projects. That means it wants to be lightweight, submodulable and standalone ...
This explains why we wouldn't want to use opendefinition as it has a bunch of other stuff in it.
Does that clarify things?
@Stephen-Gates we therefore do want to maintain this repo - but it shouldn't need (much) automation and we don't need to scrape - we can just update the licenses.csv.
@rufuspollock sure. The scripts categorised the licences into groups. The OD web site shows other groups. Do you want columns for that?
I'd start with what we have here for now and think about the main od website differences later once this first piece of work is done.
restarting work on this
@mlinksva @rufuspollock this is ready for review.
Very tempted to delete the family column in licenses.csv - thoughts?
Edit: forgot we chatted about that in #54
Looks like an improvement overall.
Is the CSV the master and the JSON generated? I assume deploy.py is run after PR accepted and not as part of PR?
Just skimming deploy.py, CSV is unused. The script generates jsonp. Probably worth running and adding a commit to this with any updates generated.
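For illustration, if the CSV were treated as the master, the JSON could be derived from it roughly as below. This is only a sketch under assumptions: the `id` key column, the `csv_to_json` name and the sample row are made up for the example, and it is not what deploy.py currently does.

```python
import csv
import io
import json


def csv_to_json(csv_text, key="id"):
    """Turn CSV text into a JSON object keyed by the `key` column.

    The column name "id" is an assumption for this sketch.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    licenses = {row[key]: row for row in rows}
    return json.dumps(licenses, indent=2, sort_keys=True)


# Made-up sample row, not real repo data.
sample = "id,title,status\nEXAMPLE-1.0,Example Licence 1.0,active\n"
print(csv_to_json(sample))
```

Generating the JSON from the CSV this way would keep a single source of truth, so a PR would only need to touch licenses.csv.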
Another suggested change, can OGL-UK-2.0 be marked as superseded by OGL-UK-3.0?
I guess it can be marked as superseded, as is OGL-UK-1.0. There's no mechanism to denote by what, is there?
Not that I'm aware of. For OGL-UK-2.0, being superseded by OGL-UK-3.0 is obvious. It is less clear that GeoGratis was superseded by OGL-Canada-2.0.
Perhaps a bit of a re-think is needed as there is some good information captured by http://opendefinition.org/licenses/ and http://opendefinition.org/licenses/nonconformant/ that isn't captured in the csv or json. E.g.
@mlinksva reverted to WIP.
By adding status of "superseded" to OGL-UK-2.0, I should have got an error in GoodTables.io due to the enum constraint.
"superseded" is in the csv for two entries, one on purpose, one by mistake.
This led me to ask what the correct spelling of "superceded" is <- GitHub's autocorrect is telling me, "not this".
Are you happy to leave it as "superceded"?
I'll explore the GoodTables.io issue and then correct the schema/data
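Until the validator catches it, a quick local check for values outside the enum could look something like this. A rough sketch only: the allowed values, the `id` and `status` column names and the sample rows are assumptions for illustration, and the real enum lives in the schema.

```python
import csv
import io

# Assumed enum for this sketch; the real allowed values are in the repo's schema.
ALLOWED_STATUS = {"active", "retired", "superseded"}


def invalid_status_rows(csv_text, column="status", allowed=ALLOWED_STATUS):
    """Return (line_number, id, value) for rows whose status is not in `allowed`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        if row[column] not in allowed:
            problems.append((line_number, row.get("id", ""), row[column]))
    return problems


# Illustrative rows only: one spelling matches the assumed enum, the other does not.
sample = (
    "id,status\n"
    "OGL-UK-1.0,superceded\n"
    "OGL-UK-2.0,superseded\n"
    "OGL-UK-3.0,active\n"
)
print(invalid_status_rows(sample))
```

Whichever spelling ends up in the schema, a check like this would flag the rows that use the other one.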
Makes sense to use the more common spelling, unless something that we can't change is depending on supercede.
GoodTables issue raised at https://discuss.okfn.org/t/launching-goodtables-io-tell-us-what-you-think/5165/35
After reading Leigh Dodds' post, The state of open licensing, 2017 edition, I wonder if the following licences should be added to licenses.csv as not reviewed?
@mlinksva @rufuspollock
@roll has fixed the schema validation on GoodTables.io (see Job History).
This PR is now good to go unless you'd like the open licences mentioned above included.
I would still like Continuous Data Integration set up as suggested.
> After reading Leigh Dodds' post, The state of open licensing, 2017 edition, I wonder if the following licences should be added to licenses.csv as not reviewed?
Yes, but that is a separate PR / issue so let's do it after this gets merged.
@mlinksva is this good to go in your opinion? If so let's get it merged 👍 😄
And huge well done and thanks to @Stephen-Gates for his awesome contribution here. 💯 🥇
@mlinksva 👍 👏
@Stephen-Gates 👏 👏 🥇 💯 🎱
Started work on updating licenses.csv.
Added Community files
Separated out (unmaintained) Changelog from Readme
Other questions: