fair-software / howfairis

Command line tool to analyze a GitHub or GitLab repository's compliance with the fair-software.eu recommendations
https://pypi.org/project/howfairis/
Apache License 2.0

LICENSE file check for bitbucket #183

Open sverhoeven opened 3 years ago

sverhoeven commented 3 years ago

In https://github.com/fair-software/howfairis/blob/6edfcc98ebe2520136ba55ef906217dc43d8cd19/howfairis/mixins/license_mixin.py#L19 the presence of a license is checked by calling the GitHub or GitLab API. Bitbucket does not seem to have an API endpoint for licenses.

We could check whether a LICENSE file is present in the Bitbucket repo.

jspaaks commented 3 years ago

If you're suggesting just checking whether a file named LICENSE is present, I agree. Do we want to include any of these others?

filenames:

  1. license
  2. License
  3. LICENSE
  4. licence
  5. Licence
  6. LICENCE
  7. copying
  8. Copying
  9. COPYING

extensions:

  1. without extension
  2. .txt
  3. .md
  4. .rst

That already amounts to 36 requests (9 filenames × 4 extensions), unless we can ask for the directory tree (which looks possible, e.g. https://api.bitbucket.org/2.0/repositories/jspaaks/badge-test/src). We should probably decide on our preferred order for checking these, and return as soon as we find the first match.
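The 9 × 4 candidates and the first-match rule could be sketched roughly as below. This is only an illustration: the names `candidate_filenames` and `find_license_file` are hypothetical and not part of howfairis, and the checking order shown (simply the order of the lists above) is an assumption, since the preferred order has not been decided.

```python
from itertools import product

# Order follows the lists above; the preferred order is still undecided.
STEMS = [
    "license", "License", "LICENSE",
    "licence", "Licence", "LICENCE",
    "copying", "Copying", "COPYING",
]
EXTENSIONS = ["", ".txt", ".md", ".rst"]


def candidate_filenames():
    """Return every stem/extension combination, stems varying slowest."""
    return [stem + ext for stem, ext in product(STEMS, EXTENSIONS)]


def find_license_file(root_listing):
    """Return the first candidate found in the root listing, or None."""
    present = set(root_listing)
    for name in candidate_filenames():
        if name in present:
            return name
    return None
```

Given a listing of the repository root, `find_license_file` needs no further requests; it only compares names and never looks at file contents.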

Sidenote: I'm disinclined to write or maintain code that looks at the content of these files. Besides the degrees of freedom in naming them, there are many licenses, and there is also a lot of wiggle room in each license file's text, such as additions, typos, formatting differences, etc. Covering all of these is a lot of work that I think we should not do. I know that tools exist that can determine the license of a repo (e.g. licensee), but these complicate things more than I'd be willing to accept.

sverhoeven commented 3 years ago

Yes, other namings of LICENSE would be nice.

I don't intend to look inside the files either; the FAIR recommendation just wants any license.

We can reduce the requests by getting a listing of the root of a repo.
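A single-request root listing might look something like the sketch below. It assumes the Bitbucket 2.0 `src` endpoint mentioned earlier in this thread returns JSON with a `values` list whose entries carry `path` and `type` keys, with files marked as `commit_file`; that response shape should be verified against the Bitbucket documentation. The function names are hypothetical, and pagination is deliberately not handled.

```python
import json
from urllib.request import urlopen

# URL pattern taken from the example earlier in this thread.
API = "https://api.bitbucket.org/2.0/repositories/{owner}/{repo}/src"


def root_filenames(page):
    """Extract file paths from one parsed page of the root listing.

    Assumption: entries have "path" and "type" keys, and regular files
    have type "commit_file" (directories are skipped).
    """
    return [
        entry["path"]
        for entry in page.get("values", [])
        if entry.get("type") == "commit_file"
    ]


def fetch_root_filenames(owner, repo):
    """Fetch the first page of the root listing with one request."""
    url = API.format(owner=owner, repo=repo)
    with urlopen(url, timeout=10) as response:
        return root_filenames(json.load(response))
```

Keeping the parsing in `root_filenames` separate from the network call means the filename check can be tested without contacting Bitbucket.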

jspaaks commented 3 years ago

> the FAIR recommendation just wants any license.

But note that it recommends using a license that already exists, not arbitrary text.