Closed sergey-alekseev closed 5 years ago
They do the same thing (@jonabc will correct me if that's not the case) but the use I've seen is like the example implies: ignored for internal dependencies, reviewed for external dependencies that aren't automatically allowed per the allowed list and license detection.
Ideally/wishlist I suppose they could have some different additional tooling, eg check whether a dependency can be moved out of approved and tests still pass, guardrails to detect whether something has changed that might merit a new review, or to detect whether a dependency really is internal.
@sergey-alekseev @mlinksva
reviewed
is meant for dependencies that need to be cached and checked, but do not have a license found that matches the allowed
configured licenses.
ignored
is meant for dependencies that shouldn't be cached at all, and will not have cached metadata written to the repo.
Good question! I'll update the docs when I have a chance, though I'd also welcome a PR :smile:
Thanks for the clarification! @jonabc
I copy-pasted your comments and made a PR to add the docs change to your to-do list.
An interesting thing though. Chances are an external dependency doesn't have a license info (examples: https://github.com/iain/http_accept_language/issues/72, https://github.com/grosser/i18n_data/pull/47, https://github.com/grosser/sort_alphabetical/pull/23). If added to reviewed
then licensed status
fails with "missing license text". Hence they should be added to ignored
. So I'm not sure whether "but do not have a license found that matches the allowed
configured licenses." suits well for reviewed
🤔
@sergey-alekseev what you described is intentional, though I'm not sure what the recommended remediation here is. @mlinksva any thoughts?
The goal is to have cached metadata for all external dependencies that are used by the shipped product. The ignored
flag is to say that a dependency that's being found isn't shipped as part of the public product, or it's something built by the project owner and isn't an external dependency. An external dependency that's found but doesn't have license text shouldn't be set as ignored
I'd add (and recall having done this) some evidence for the reviewed status in the case no license text at all is extracted. In the case of the first example above, "Released under the MIT license" pulled from https://github.com/iain/http_accept_language/blob/74a6a244afa069154d08df0f5df373a6c7fc4fe2/README.md#with-bundler
Note https://github.com/github/licensed/issues/189 (depending on specific implementation) could make this situation less common: in the case of the first example above again, https://github.com/iain/http_accept_language/blob/74a6a244afa069154d08df0f5df373a6c7fc4fe2/http_accept_language.gemspec#L13 would be recorded by licensed along with any license texts found.
Aside: thank you @sergey-alekseev for making PRs (seen from your examples, also thanks for providing concrete examples!) to improve licensing documentation from projects where it is missing, silently saving time for developers using (as they should) license tooling (licensed or numerous others) who might otherwise need to ask for review of innocuous dependencies.
@mlinksva I probably didn't put enough emphasis on "missing license text". To make sure we are on the same page first let me clarify. licensed cache
fetches licenses well for those examples. What's missing in these examples is license text (e.g. from a LICENSE
file). See below:
$ licensed status
Checking cached dependency records for project
..........F.F....................F...................
Errors:
* project.bundler.http_accept_language
filename: project/.licenses/bundler/http_accept_language.dep.yml
- missing license text
* project.bundler.i18n_data
filename: project/.licenses/bundler/i18n_data.dep.yml
- missing license text
* project.bundler.sort_alphabetical
filename: project/.licenses/bundler/sort_alphabetical.dep.yml
- missing license text
N dependencies checked, 3 errors found.
project/.licenses/bundler/http_accept_language.dep.yml
:
---
name: http_accept_language
version: 2.1.1
type: bundler
summary: Find out which locale the user preferes by reading the languages they specified
in their browser
homepage: https://github.com/iain/http_accept_language
license: mit
licenses: []
notices: []
project/.licenses/bundler/i18n_data.dep.yml
:
---
name: i18n_data
version: 0.7.0
type: bundler
summary: country/language names and 2-letter-code pairs, in 85 languages
homepage: https://github.com/grosser/i18n_data
license: mit
licenses: []
notices: []
project/.licenses/bundler/sort_alphabetical.dep.yml
:
---
name: sort_alphabetical
version: 1.1.0
type: bundler
summary: Sort UTF8 Strings alphabetical via Enumerable extension
homepage: https://github.com/grosser/sort_alphabetical
license: mit
licenses: []
notices: []
project/.licensed.yml
:
sources:
bundler: true
allowed:
- mit
reviewed:
bundler:
- http_accept_language # no license file (see https://github.com/iain/http_accept_language/issues/72)
- i18n_data # no license file (see https://github.com/grosser/i18n_data/pull/47)
- sort_alphabetical # no license file (see https://github.com/grosser/sort_alphabetical/pull/23)
So if I understand correctly your proposal is to change project/.licenses/bundler/http_accept_language.dep.yml
to the following:
name: http_accept_language
version: 2.1.1
type: bundler
summary: Find out which locale the user preferes by reading the languages they specified
in their browser
homepage: https://github.com/iain/http_accept_language
license: mit
licenses:
- sources: README.md
text: Released under the MIT license
notices: []
Am I correct?
Also I can fill in recently added license text for project/.licenses/bundler/i18n_data.dep.yml
and project/.licenses/bundler/sort_alphabetical.dep.yml
even though they didn't have that text at that version. Would that be fine in your view?
Aside: there is still a room for Licensed improvement. For my project I had to change 36 .licenses/bundler/*.dep.yml
files manually our of 171 dependencies (21%). 23 times I changed only license: other
, 11 times only licenses: []
and 2 times both. Filling in some details manually is quite easy and quick, so I didn't want to spend more time exploring the source code to understand why it happens. However one interesting thing I noticed is Licensee fetching license
well from a git source, but not locally:
2.5.5 :001 > require 'licensee'
# => true
2.5.5 :002 > license = Licensee.license '.../gems/awesome_print-1.8.0'
# => #<Licensee::License key=other>
2.5.5 :003 > Licensee.license 'https://github.com/awesome-print/awesome_print/tree/v1.8.0'
# => #<Licensee::License key=mit>
An external dependency that's found but doesn't have license text shouldn't be set as
ignored
Does this mean that even though an external dependency has e.g. the MIT license but doesn't have license text you can't use it? At least licensed status
will not succeed. I'd agree with this comment, but you know licensing requirements, so can you please clarify? @jonabc
I did a quick search to see whether you had any comments on that during the development, but looks like the requirement was added during the initial release of v1.0.0 in the first commit.
Going to answer as many of these as I can 😄
So if I understand correctly your proposal is to change project/.licenses/bundler/http_accept_language.dep.yml to the following:
Any content licensee detects from a README
is already included in cached output, so I don't think it's that.
What would change in this example is that we'd be showing the license specification picked up from http_accept_language.gempsec. Still working through the best way to show that, but as a possible example
name: http_accept_language
version: 2.1.1
type: bundler
summary: Find out which locale the user preferes by reading the languages they specified
in their browser
homepage: https://github.com/iain/http_accept_language
license: mit
licenses:
- sources: http_accept_language.gemspec
text: s.license = "MIT"
notices: []
If my understanding is correct, this is a slippery slope for the reasons outline by licensee. If others with better understanding of OSS licensing requirements give a 👍 that representing this content as a license is ok I'm happy to make that change.
For my project I had to change 36
.licenses/bundler/*.dep.yml
files manually our of 171 dependencies (21%). 23 times I changed onlylicense: other
, 11 times onlylicenses: []
and 2 times both
Yeah that's painful. Some feedback around the details of why you had so many adjustments needed would be great
license: other
: Is the license text far enough off from the standard template for the license that a false negative is found? Are multiple licenses found? Something else?licenses: []
: Do all 11 of these not have any LICENSE
type file and nothing in the README?That said, licensed won't overwrite the license
field on dependency version updates if the (normalized) license contents are determined to be the same between the cached and new version. In our experience there is high likelihood that nothing changes re: license contents between versions of a dependency, so a majority of the pain in clarifying licenses is only experienced once.
However one interesting thing I noticed is Licensee fetching license well from a git source, but not locally
That's a better question to ask at Licensee directly, though in my experience I've found this has to do with gem not installing all the files that Licensee uses from the repo as part of the package.
Does this mean that even though an external dependency has e.g. the MIT license but doesn't have license text you can't use it? At least licensed status will not succeed. I'd agree with this comment, but you know licensing requirements, so can you please clarify?
This is the part where I get to say I'm not a lawyer and don't personally have an answer on what constitutes a binding agreement when using an OSS license. I don't know @grosser
personally and couldn't tell you whether his comment is based in legal knowledge or personal opinion.
As I mentioned before, licensee does bring this up in case that helps?
I did a quick search to see whether you had any comments on that during the development, but looks like the requirement was added during the initial release of v1.0.0 in the first commit.
The requirement is older than my involvement in the product 😆 I asked awhile ago whether reviewed
dependencies should still surface this error, and the answer was to keep the error in place. I don't recall whether the question+answer was pre-v1 release or not, but I couldn't find anything in the issue/PR repo history.
Thanks for your answer @jonabc
Any content licensee detects from a README is already included in cached output, so I don't think it's that.
I believe my previous assumption was wrong if I understand your comment correctly. Then I'm not sure how to make licensed status
green in this case. Probably with sources: http_accept_language.gemspec
as you outlined. Can you please rephrase and clarify, so I can understand better what you mean?
Some feedback around the details of why you had so many adjustments needed would be great
Some examples added to https://github.com/sergey-alekseev/licensed-example. All of them can be fixed manually I believe. Some previous failures were fixed by https://github.com/github/licensed/pull/188 and no longer require manual entry.
so a majority of the pain in clarifying licenses is only experienced once.
Yes. So not really painful.
I believe my previous assumption was wrong if I understand your comment correctly. Then I'm not sure how to make licensed status green in this case. Probably with sources: http_accept_language.gemspec as you outlined. Can you please rephrase and clarify, so I can understand better what you mean?
That sounds like the best option for now as the most straightforward fix. Hopefully as we figure out https://github.com/github/licensed/issues/189, much of this will just fix itself automatically. If you have any specific preference beyond "just make it work please", please do to comment on the issue 🙇 😄
Some examples added to https://github.com/sergey-alekseev/licensed-example
I forked your example 👍 I don't have enough time to dig in at the moment but I'd love to look through this and see exactly why each other
classification occurred. It'd be great if we could better support some of these scenarios
Awesome. Hopefully https://github.com/github/licensed/issues/189 will solve the problem.
@sergey-alekseev thanks for your patience!
I just completed a quick audit of the example you provided. Here's my findings for things without license contents, and those marked other
This class of failures is what we're trying to address in https://github.com/github/licensed/issues/189.
The list here shows each dependency that was classified as "other", which files licensed
found for each dependency, and the resulting classification from licensee
for each file. In the case of other
classifications for a file I also added a note on what I was able to tell on why a license wasn't detected
The majority of the failures here were due to deviations from an official license template for various reasons. I think some of them might be actionable to fix in the license detection logic in licensee, but many are likely to continue to be painful.
It's getting a little late in my day, but I'll open an issue tomorrow over in licensee and link to this + the example repo and see if there are any thoughts on improvements.
The MIT License (MIT)
as part of the wordset because it's after the copyright noticedependency.readme_file.content_normalized == ""
jQuery UI as well as this gem are licensed under the MIT license (see jquery-ui/MIT-LICENSE.txt).
LICENSE
above The MIT License
OR COPYRIGHT HOLDERS
from license text is the only difference I can see2 clause bsd
, where one of bsd 2-clause
or bsd-2-clause
was expected@jonabc thanks for improving Licensed! Hope it'll be getting better and better with detecting licenses.
Just wanted to clarify how I fixed the first category of gems. Here is what I filled in manually for the gems with sources:
aws-eventstream, has license file aws-partitions, has license file aws-sdk-core, has license file aws-sdk-kms, has license file aws-sdk-s3, has license file aws-sigv4, has license file bundler, has license file libv8, has license in README paper_trail, has license file rails-settings-cached, has license file
So all of them have license files actually.
@sergey-alekseev I agree with you that the remote repositories have license files and/or licenses in the readme, but those files aren't included in the distributed gem - you can check each gemspec spec.files
property to see what is included.
The only special case here is bundler
, which I've found to kind've do its own thing and may or may not be directly installed with ruby. For example my local bundler installation doesn't have a README or LICENSE
➜ licensed-example git:(master) ✗ ls -l ~/.rbenv/versions/2.6.4/lib/ruby/gems/2.6.0/gems/bundler-1.17.2/
total 0
drwxr-xr-x 4 jonabc staff 128 Oct 9 10:28 exe
➜ licensed-example git:(master) ✗ ls -l ~/.rbenv/versions/2.6.4/lib/ruby/2.6.0/bundler/
build_metadata.rb deployment.rb gem_helpers.rb lockfile_generator.rb ruby_dsl.rb source.rb vendored_persistent.rb
capistrano.rb deprecate.rb gem_remote_fetcher.rb lockfile_parser.rb ruby_version.rb source_list.rb vendored_thor.rb
cli/ dsl.rb gem_tasks.rb match_platform.rb rubygems_ext.rb spec_set.rb version.rb
cli.rb endpoint_specification.rb gem_version_promoter.rb mirror.rb rubygems_gem_installer.rb ssl_certs/ version_ranges.rb
compact_index_client/ env.rb gemdeps.rb plugin/ rubygems_integration.rb stub_specification.rb vlad.rb
compact_index_client.rb environment_preserver.rb graph.rb plugin.rb runtime.rb templates/ worker.rb
compatibility_guard.rb errors.rb index.rb process_lock.rb settings/ ui/ yaml_serializer.rb
constants.rb feature_flag.rb injector.rb psyched_yaml.rb settings.rb ui.rb
current_ruby.rb fetcher/ inline.rb remote_specification.rb setup.rb uri_credentials_filter.rb
definition.rb fetcher.rb installer/ resolver/ shared_helpers.rb vendor/
dep_proxy.rb friendly_errors.rb installer.rb resolver.rb similarity_detector.rb vendored_fileutils.rb
dependency.rb gem_helper.rb lazy_specification.rb retry.rb source/ vendored_molinillo.rb
I've updated https://github.com/jonabc/licensed-example/ to include the full content after bundle install --path vendor/gems
. This is the "source of truth" that github/licensed
uses to determine dependency data.
It's mostly complete but won't contain data on the ruby default gems like bundler and json
Makes sense, I haven't thought about it. So Licensed can't fetch those license files from local directories as they simply do not exist there. They can still be grabbed from repositories and filled in manually then when necessary.
@mlinksva @benbalter I've pared down https://github.com/jonabc/licensed-example/tree/master/vendor/gems/ruby/2.6.0/gems to only include the dependencies in the audit above.
I'm not expecting anything to be actionable for dependencies on the no license contents
list, but for the others it's not immediately clear which cases should be detected or not. If anyone gets a chance could you take a look and see if there is anything in there you'd expect that licensee would already detect, or would be reasonable to detect with additional work?
🙇
Recently I got a question from my teammate:
I found only the following:
For me it's still not really clear when I should put a gem into
ignored:
and when intoreviewed:
. Can you guide me?Probably this should be clarified in the docs or combined into a single category to avoid confusions and save time debating with colleagues.