Open geekygirldawn opened 1 month ago
Hey @geekygirldawn thanks for filing this! You're right that the licenseInfo is inconsistent with the other "community standards" type of docs. We did a little digging internally and there is already a (private) API method that we could use to expose the filename, exactly like the resourcePath field does on codeOfConduct that you noted. Would adding that to the API be sufficient to get you going on this?
FWIW I don't expect the special case of tracking license content changes over time as a first-class API endpoint to happen; it seems like quite a niche that would have a high engineering cost. We don't in general do time-series/historical changes due to storage constraints, and, as you're proposing, with the file info it could be derived from the commit history.
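To make the "derive it from the commit history" idea concrete, here is a minimal sketch. It assumes you have already collected one snapshot per commit that touched the license file and labeled each with some license identifier. The `Snapshot` type, the `license_id` labels, and the `license_changes` helper are all hypothetical names for illustration, not part of any GitHub API, and how you classify the file contents is left out entirely.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Snapshot(NamedTuple):
    """One commit that touched the license file (hypothetical shape)."""
    committed_date: str  # ISO 8601 timestamp, assumed UTC and uniformly formatted
    license_id: str      # label you assigned to the file contents at that commit


def license_changes(snapshots: Iterable[Snapshot]) -> List[Tuple[str, str, str]]:
    """Return (date, old_id, new_id) for every point where the label changes."""
    # Uniformly formatted ISO 8601 UTC timestamps sort correctly as strings.
    ordered = sorted(snapshots, key=lambda s: s.committed_date)
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.license_id != cur.license_id:
            changes.append((cur.committed_date, prev.license_id, cur.license_id))
    return changes
```

Commits that only reformat the file or bump a copyright year keep the same label, so they drop out and only real license switches remain.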
Oh, and regarding the url field: I too find it a bit strange that it returns a link to choosealicense.com rather than the github.com URL to the file, but changing that would be considered a breaking API change 😢
> We did a little digging internally and there is already a (private) API method that we could use to expose the filename, exactly like the resourcePath field does on codeOfConduct that you noted. Would adding that to the API be sufficient to get you going on this?
That would be super helpful, thank you!
> FWIW I don't expect the special case of tracking license content changes over time as a first-class API endpoint to happen; it seems like quite a niche that would have a high engineering cost. We don't in general do time-series/historical changes due to storage constraints, and, as you're proposing, with the file info it could be derived from the commit history.
I didn't think so, but I thought it wouldn't hurt to ask :)
Ideally, I would love to be able to easily get data out of the GitHub API that shows when a repository has changed their license, especially when it has changed from an open source license to a non-open source one or a more restrictive license.
As @gyehuda mentioned:
Details in this discussion: https://github.com/todogroup/ospology/discussions/480
Or maybe it would be cool for GitHub to surface this another way - maybe something like https://innovationgraph.github.com/? I know Innovation Graph itself is focused on metrics broken out across various economic areas, so not exactly like that, but maybe there are some other things that companies care about (e.g., licenses, dependents info, supply chain security metrics) that could be grouped together in a way that lets people explore / analyze that data more easily?
On a related note, the way that the GraphQL API handles data from the licenseInfo object seems counterintuitive to me, partly because it returns very different things from what the API returns for other auto-discovered file objects, like codeOfConduct.
Here's an example query:
And the output of the query:
From the licenseInfo object, I can't seem to get to the actual name, url, or path of the file where the license is stored in the repository. This is unlike the codeOfConduct object, which returns the url / resourcePath, which lets me programmatically determine where I can find the file within the repository.
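For anyone who wants to reproduce the comparison, here is a minimal sketch of a query that requests both objects side by side. It is not necessarily the query from this post. The field names (`licenseInfo { name spdxId url }`, `codeOfConduct { name url resourcePath }`) are from the public GitHub GraphQL schema; the owner/name values, the query name, and the `build_payload` helper are placeholders made up for illustration.

```python
# Sketch only: builds the request body, does not call the network.
COMMUNITY_FILES_QUERY = """
query CommunityFiles($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    licenseInfo {
      name
      spdxId
      url            # points at choosealicense.com, not at a file in the repo
    }
    codeOfConduct {
      name
      url
      resourcePath   # path to the file within the repository
    }
  }
}
"""


def build_payload(owner: str, name: str) -> dict:
    """Return the JSON-serializable body for a POST to https://api.github.com/graphql."""
    return {
        "query": COMMUNITY_FILES_QUERY,
        "variables": {"owner": owner, "name": name},
    }


# Placeholder repository, not the one discussed in this thread.
payload = build_payload("example-org", "example-repo")
```

The payload can be sent with any HTTP client, with a token in the `Authorization: Bearer` header. Note that `licenseInfo` has no `resourcePath` counterpart to request, which is the gap described above.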
If I could derive the location / name of the license file in the repo via licenseInfo (or some other method), I could use it as the input into another query to get details about the commits for the file. In the below example, I hardcoded the name of the license file after manually looking it up in the repo, but ideally, I could get this from the GitHub API and pass it in as a variable into a query that would give me commit details.
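As a rough sketch of that second step, assuming the license path were available from the API, the commit query could take the path as a variable instead of a hardcoded string. This is not the exact query from this post. `history(path: ...)` on `Commit` and the `oid`, `committedDate`, and `messageHeadline` fields are in the public GitHub GraphQL schema; the query name, the page size of 20, and the helper names `build_history_payload` and `extract_commits` are assumptions made for illustration.

```python
# Sketch only: builds the request body and parses a response, no network calls.
LICENSE_HISTORY_QUERY = """
query LicenseHistory($owner: String!, $name: String!, $path: String!) {
  repository(owner: $owner, name: $name) {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 20, path: $path) {
            nodes {
              oid
              committedDate
              messageHeadline
            }
          }
        }
      }
    }
  }
}
"""


def build_history_payload(owner: str, name: str, license_path: str) -> dict:
    """Request body for commits on the default branch that touched license_path."""
    return {
        "query": LICENSE_HISTORY_QUERY,
        "variables": {"owner": owner, "name": name, "path": license_path},
    }


def extract_commits(response: dict) -> list:
    """Pull the commit nodes out of a decoded GraphQL response."""
    ref = response["data"]["repository"].get("defaultBranchRef")
    if ref is None:  # e.g. an empty repository with no default branch
        return []
    return ref["target"]["history"]["nodes"]
```

Today `license_path` still has to be looked up by hand; if the API exposed the filename, it could be fed straight into `build_history_payload`.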
Maybe there is another way to do this that I just haven't found?
cc: @ahpook