Open sverhoeven opened 3 years ago
If you're suggesting just checking if the file named LICENSE is present, I agree. Do we want to include any of these others?
filenames:
extensions:
That already equates to making 36 requests, unless we can ask for the directory tree (looks like that's possible, e.g. https://api.bitbucket.org/2.0/repositories/jspaaks/badge-test/src). We should probably decide on our preferred order for checking these, and then return as soon as we find the first match.
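To make the "preferred order, return on first match" idea concrete, here is a minimal sketch. The candidate names and extensions below are hypothetical placeholders, not the lists discussed in this thread, and `first_license_match` is an illustrative helper, not existing howfairis code. It assumes we already have the file names in the repo root.

```python
from itertools import product

# Hypothetical candidates, most preferred first (placeholders, not the thread's lists).
CANDIDATE_NAMES = ["LICENSE", "LICENCE", "COPYING"]
CANDIDATE_EXTENSIONS = ["", ".txt", ".md", ".rst"]


def first_license_match(root_files, names=CANDIDATE_NAMES, extensions=CANDIDATE_EXTENSIONS):
    """Return the first candidate filename found in root_files, or None."""
    present = set(root_files)
    # product() varies the extension fastest, so every extension of the first
    # name is tried before moving on to the next name.
    for name, extension in product(names, extensions):
        candidate = name + extension
        if candidate in present:
            return candidate  # stop at the first match in preference order
    return None
```

With these placeholder lists, `first_license_match(["README.md", "COPYING", "LICENSE.md"])` picks `LICENSE.md`, because every `LICENSE` variant is tried before `COPYING`.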
Sidenote: I'm disinclined to write or maintain code that looks at the content of these files. Besides the degrees of freedom in naming them, there are many licenses, and additionally there is a lot of wiggle room in each license file text, such as additions, typos, formatting differences etc. Covering all of these is a lot of work that I think we should not do. I know that tools exist that can determine the license of a repo (e.g. licensee), but these complicate things more than what I'd be willing to accept.
Yes, supporting other names for LICENSE would be nice.
I don't intend to look inside either; the FAIR recommendation just wants any license.
We can reduce the requests by getting a listing of the root of a repo.
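A rough sketch of what listing the root could look like, so it costs one request per page of results instead of one per candidate filename. It assumes the Bitbucket 2.0 `src` endpoint returns paginated JSON with a `values` list (entries carrying `type` and `path`) and an optional `next` URL; please verify that against the Bitbucket docs. `list_root_files` and `fetch_json` are hypothetical helper names, not part of howfairis.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body (urlopen follows redirects)."""
    with urlopen(url) as response:
        return json.load(response)


def list_root_files(owner, repo, fetch=fetch_json):
    """Return the paths of the files in the root of a Bitbucket repository.

    Assumes the response shape described above; `fetch` is injectable so the
    function can be tried without network access.
    """
    url = f"https://api.bitbucket.org/2.0/repositories/{owner}/{repo}/src"
    files = []
    while url:
        page = fetch(url)
        for entry in page.get("values", []):
            # Keep files only; directories are assumed to be "commit_directory".
            if entry.get("type") == "commit_file":
                files.append(entry["path"])
        url = page.get("next")  # absent on the last page
    return files
```

The resulting list can then be compared against whatever license filenames we decide to accept.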
> the FAIR recommendation just wants any license.

But note that it recommends using one that already exists, not random text.
In https://github.com/fair-software/howfairis/blob/6edfcc98ebe2520136ba55ef906217dc43d8cd19/howfairis/mixins/license_mixin.py#L19 the presence of a license is checked by calling the GitHub or GitLab API. Bitbucket does not seem to have an API endpoint for licenses.
We could check if a LICENSE file is present in the Bitbucket repo.