nodeshift-archived / license-reporter

license-reporter is a tool that gathers licenses for project's dependencies and produces a output in XML, JSON, YAML and HTML format.
Apache License 2.0
13 stars 10 forks source link

[Suggestion] A Diff tool for license text #189

Closed lgriffin closed 6 years ago

lgriffin commented 6 years ago

While integrating this into the RainCatcher community a thought struck me. A diff style functionality while running the tool multiple times would be very welcome to prevent two potential security flaws:

  1. To check if a license type has actually changed

This is self explanatory, a change in license type should be a red flag for a developer to review the dependency. The tool will obviously flag if a dependency is not allowed however the new License type might be allowed but it should still be flagged for a review to figure out why the License has changed

  1. A change in the local license text

While this would be a bit harder to implement, we already parse and uniquely name the local license (which is great!). The local license is the License that appears word for word in the dependency. That means the owner of the dependency could change the text to infer different meaning, but keep the high level license name which our tool would pick up and be satisfied that it is a good license. The actual wording might be more malicious and open a security vulnerability.

TL:DR A diff style comparison to flag any changes in license type or local license text, at a dependency level, for multiple iterations on the license.xml in automated tooling would really set this tool apart from others.

grdryn commented 6 years ago

For our FeedHenry projects, we're going to commit all the files (including local copies of all the licenses), and we run license-reporter to update those any time we run npm shrinkwrap (i.e. any time we change a dependency). This way, we can see changes directly in Git, and don't have to remember to run a hypothetical future license-reporter diff command.

I'm not sure if that would cover what you want here or not, just thought it worth mentioning! :)

Edit: see here for an example: https://github.com/feedhenry/fh-mbaas/pull/83

helio-frota commented 6 years ago

@grdryn This seems to be a awesome approach.

If we need to try diff from license-reporter:

  1. user need to know/remember how it works. for instance: license-reporter diff or something else and how it behaves.
  2. it will be by default more slow since git is written in c.
  3. I don't think it will add more complexity because after the refactor we have new concepts like: commands and modules in separated files. So now is more hard to break the other existing features.
  4. we need to use 3rd dependencies like https://github.com/kpdecker/jsdiff or another.

@lgriffin WDYT ?

@grdryn thanks for the feedback

lgriffin commented 6 years ago

I think it's hugely valuable to know if the local license text changes to something malicious (or unintended). The tool may pick it up as license A but the text might be license B or it might have words that do not involve license A added to it. So I'm happy if we default that ability to visualising it via Git or via a diff feature in the tool. For RHMAP, as we will check in locally, Git would suffice. For the community and other developers something within the tool would be cool. that would definitely be outside of the 1.0 scope in my view.

helio-frota commented 6 years ago

I was testing and thinking about it and I noticed it's more complicated than I thought.

legend: Text Diff --> diff between the text inside license files Type Diff --> diff between the attribute name of the XML file:

<licenseSummary>
    <project>szero</project>
    <version>1.0.0</version>
    <license>Apache-2.0</license>
    <dependencies>
        <dependency>
            <packageName>node-builtins</packageName>
            <version>0.1.1</version>
            <licenses>
                <license>
                    <name>Apache License 2.0</name>   < ------------------- this
helio-frota commented 6 years ago

I don't know if makes sense to try to call git from the license-reporter, because it will be a source- code-change anyway, and in this case we need the SCM.

helio-frota commented 6 years ago

I think we can close due this: #279