LLNL / scraper

Python library for getting metadata from source code hosting tools
MIT License

Redundant License Information (GitHub) #38

Open LRWeber opened 5 years ago

LRWeber commented 5 years ago

It looks like some amount of license information is manually maintained and provided by scraper/github/util.py.

Complete license listings are readily available from GitHub, which seems to make this script redundant. They can be retrieved either

via the REST API: https://api.github.com/licenses

or via the GraphQL API:

{
  licenses {
    spdxId
    name
    url
  }
}
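
For illustration, here is a rough sketch of how that GraphQL query might be sent from Python using only the standard library. The helper names `build_request` and `parse_licenses` are hypothetical and not part of scraper, and it assumes a personal access token is available, since the GraphQL endpoint requires authentication.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

# Same fields as the query above; note there are no parentheses after
# `licenses`, because the field is queried without arguments.
LICENSES_QUERY = """
{
  licenses {
    spdxId
    name
    url
  }
}
"""


def build_request(token):
    """Build (but do not send) a POST request carrying the licenses query.

    Hypothetical helper; `token` is assumed to be a GitHub access token.
    """
    body = json.dumps({"query": LICENSES_QUERY}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_licenses(response_json):
    """Pull the list of license objects out of a decoded GraphQL response."""
    return response_json["data"]["licenses"]


if __name__ == "__main__":
    import os

    # Assumes the token lives in an environment variable named GITHUB_TOKEN.
    request = build_request(os.environ["GITHUB_TOKEN"])
    with urllib.request.urlopen(request) as response:
        for lic in parse_licenses(json.load(response)):
            print(lic["spdxId"], "-", lic["name"])
```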

Perhaps we could rewrite scraper/code_gov/models.py and any other relevant code so that we no longer have to maintain these hard-coded mappings?
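
As a rough sketch of what that could look like, the snippet below builds a lookup table from the REST listing instead of a hand-maintained dictionary. The function names `fetch_licenses` and `build_license_map` are hypothetical, and the shape of the resulting mapping is an assumption; it is not taken from the existing code in scraper/github/util.py or scraper/code_gov/models.py.

```python
import json
import urllib.request

LICENSES_URL = "https://api.github.com/licenses"


def fetch_licenses(url=LICENSES_URL):
    """Download the license listing from the GitHub REST API.

    Hypothetical helper; unauthenticated requests are rate limited.
    """
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "license-listing-sketch",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def build_license_map(entries):
    """Map each SPDX identifier to its name and URL.

    Entries without an SPDX identifier are skipped. The output shape is
    an assumption made for this sketch.
    """
    return {
        entry["spdx_id"]: {"name": entry["name"], "url": entry["url"]}
        for entry in entries
        if entry.get("spdx_id")
    }


if __name__ == "__main__":
    license_map = build_license_map(fetch_licenses())
    for spdx_id, info in sorted(license_map.items()):
        print(spdx_id, "-", info["name"])
```

Keeping the download separate from the mapping step means the mapping can be tested against a saved response, and the result could be cached so that scraper does not hit the API on every run.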

(Suggested in #37)