cc-archive / cc-link-checker

Automated link checker for legalcode and license URLs
MIT License

Add external consistency (pull URIs from /licenses/index.rdf) #29

Closed mzeinstra closed 4 years ago

mzeinstra commented 5 years ago

Happy to see this tool emerge in the repo. Good that we are starting to take the 404s on the licenses seriously. If I read the tool correctly, it currently tests whether the files in the repo are also the files on the site. However, if a file is missing from the repo, then this tool would give a false positive.

I would suggest adding http://creativecommons.org/licenses/index.rdf as an additional source of URIs to check against.

Formally, that RDF lists all URLs (deprecated and current) of the legal tools that CC offers. It includes the public domain tools, which live under a different root: /publicdomain instead of /licenses.
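
The suggestion above could be sketched roughly as follows: collect every URI named in the RDF and compare it against the URIs derived from the repo files. This is only an illustration, not the tool's actual code. It assumes that the entries in index.rdf carry their URI in an `rdf:about` attribute; `extract_uris`, `missing_from_repo`, and the inline `SAMPLE` document are hypothetical names and data made up for the example.

```python
import xml.etree.ElementTree as ET

# Standard RDF namespace; ElementTree exposes namespaced attributes
# in "{namespace}localname" form.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ABOUT = "{" + RDF_NS + "}about"


def extract_uris(rdf_xml):
    """Return the set of all rdf:about values found in an RDF/XML string.

    Assumption: each legal tool is described by an element that carries
    its URI in rdf:about. The element names themselves are ignored.
    """
    root = ET.fromstring(rdf_xml)
    return {el.attrib[RDF_ABOUT] for el in root.iter() if RDF_ABOUT in el.attrib}


def missing_from_repo(rdf_uris, repo_uris):
    """Return URIs listed in the RDF that the repo-derived list lacks."""
    return sorted(set(rdf_uris) - set(repo_uris))


# Hypothetical, hand-written sample standing in for the real index.rdf.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cc="http://creativecommons.org/ns#">
  <cc:License rdf:about="http://creativecommons.org/licenses/by/4.0/"/>
  <cc:License rdf:about="http://creativecommons.org/publicdomain/zero/1.0/"/>
</rdf:RDF>"""

if __name__ == "__main__":
    # Pretend the repo only contains the /licenses entry.
    repo = {"http://creativecommons.org/licenses/by/4.0/"}
    for uri in missing_from_repo(extract_uris(SAMPLE), repo):
        print("in index.rdf but not in repo:", uri)
```

Any URI reported this way would be a candidate for an extra link check, since it is a case the repo files alone could not reveal.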

TimidRobot commented 4 years ago

We have found that the license legalcode files are currently the best source of truth.

index.rdf URIs are also now scanned: