odpi / egeria

Egeria core
https://egeria-project.org
Apache License 2.0

Create a list of the licenses associated with components used by Egeria #1610

Open mandy-chessell opened 4 years ago

mandy-chessell commented 4 years ago

To make it easy for someone to understand the licenses associated with our complete stack, we need to include a THIRD_PARTY.md file at the top of our git repositories that lists the licenses in use for each technology, with a link to the software's full license.

planetf1 commented 4 years ago

Agreed, this is essential.

In addition to our git repository, we may also need to ensure we have the correct licenses in each Maven artifact, as well as in any 'archives' we build for distribution.

See also #898

I've noticed a plugin that may assist in some of this - https://www.mojohaus.org/license-maven-plugin/

cmgrote commented 4 years ago

I'm willing to take a first stab at the markdown file...

> I've noticed a plugin that may assist in some of this - https://www.mojohaus.org/license-maven-plugin/

Or was the intention to use this plugin to generate the markdown file itself? (Or to use this plugin to ensure that appropriate license information is bundled with any assemblies?)

planetf1 commented 4 years ago

We want as much automation as possible, since this is a continually moving target. For example, just when investigating Spring I needed to try replacing some java.* with jakarta.*. That is not to say that change will stick, but it's going to continue changing.

We also need to figure out what needs documenting. For example we might not need to document the build plugins we use - or other code that is used at build time only.

We also have a variety of UI components, referenced in another issue, which need covering too.

The doc in the PR is a super start. We do have a review process for PRs where we could remind devs to update the license, but that is prone to error, so if we can do anything better that would be good.

cmgrote commented 4 years ago

Indeed automation would be good, but might be tricky for the top-level... A few things I noticed when pulling together the initial list:

Therefore, while it took some effort to pull together the initial list, I was thinking that we could simply monitor updates to the root-level pom.xml (which shouldn't have that many massive dependency changes going forward) to a) sanity-check that the dependencies belong, b) look up the appropriate project and license info, and c) include the appropriate references and license links in THIRD_PARTY.md.
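The monitoring step described above could be partly scripted. The following is only a minimal sketch, assuming THIRD_PARTY.md mentions each artifactId somewhere in its text; the function names and the sample POM and markdown are hypothetical and not taken from the Egeria repository.

```python
import xml.etree.ElementTree as ET

# Maven POM files use this XML namespace for all elements.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}
DEPENDENCY_TAG = "{http://maven.apache.org/POM/4.0.0}dependency"


def declared_dependencies(pom_xml):
    """Return 'groupId:artifactId' for every <dependency> element in the POM text."""
    root = ET.fromstring(pom_xml)
    coords = set()
    for dep in root.iter(DEPENDENCY_TAG):
        group = dep.findtext("m:groupId", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", namespaces=POM_NS)
        if group and artifact:
            coords.add(f"{group.strip()}:{artifact.strip()}")
    return coords


def undocumented(pom_xml, third_party_md):
    """Dependencies whose artifactId is not mentioned anywhere in THIRD_PARTY.md."""
    return sorted(
        c for c in declared_dependencies(pom_xml)
        if c.split(":")[1] not in third_party_md
    )


# Hypothetical sample data, for illustration only.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <version>1.2.3</version>
      </dependency>
      <dependency>
        <groupId>com.datastax.oss</groupId>
        <artifactId>java-driver-core</artifactId>
        <version>4.3.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
"""

SAMPLE_MD = "| logback-classic | EPL 1.0 / LGPL | http://logback.qos.ch |"

print(undocumented(SAMPLE_POM, SAMPLE_MD))
# prints ['com.datastax.oss:java-driver-core']
```

A check like this only flags candidates; looking up the right project and license still needs a person.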

When it comes to embedding details within the various assemblies, on the other hand, automation indeed is probably the way to go.

planetf1 commented 4 years ago

We also have our Nexus IQ scanning, which scans our dependencies. It identifies dual-licensed components too, and provides a vehicle for reviewing changes (i.e. marking the issues as being addressed). Builds are only weekly, but I think I need to a) run them more often and b) have them scan other releases (I'll take this action). The tool also has some knowledge of the wording used to indicate different licenses. I'll get these changes done and start reviewing the content (in particular for 1.1, which we need to release).

The Cassandra driver will be needed (I assume) a) for discovering Cassandra metadata and b) as part of a JanusGraph deployment backed by Cassandra. They should only be used by first or even second level connectors.

As a first pass, I would mostly go with a single list of what we use in Egeria.

I will check with our ODPi legal person to confirm what licenses we need to document - my current understanding is we don't need to cover build time tools in general - but I could be wrong.

We have two related objectives

Outside the scope of this issue though

planetf1 commented 4 years ago

I will investigate using the license check plugin to create a list of licenses automatically within the build, augmented by required manual additions (e.g. for Polymer) until a fully automated solution is found. Further, I will look at publishing these as some kind of artifact that can be associated with the build in Azure Pipelines and posted to Artifactory or similar. This would also allow a consuming organisation to refer to the license information when taking a particular release, by ensuring base docs and/or release notes refer to the results/artifact. cc: @cong78

planetf1 commented 4 years ago

#1845 is dependent on this first step, but will take it further.

planetf1 commented 4 years ago

I have added the 'license-maven-plugin' to the build.

First pass this will

The effect of the second of these is that the license file is picked up during packaging, so artifacts published to Maven Central contain the license file. For example:

➜  http-helper pwd
/Users/jonesn/.m2/repository/org/odpi/egeria/http-helper/1.2-SNAPSHOT

➜  1.2-SNAPSHOT jar -tf http-helper-1.2-SNAPSHOT.jar
META-INF/MANIFEST.MF
META-INF/
org/
org/odpi/
org/odpi/openmetadata/
org/odpi/openmetadata/http/
META-INF/maven/
META-INF/maven/org.odpi.egeria/
META-INF/maven/org.odpi.egeria/http-helper/
org/odpi/openmetadata/http/HttpHelper$1.class
org/odpi/openmetadata/http/HttpHelper.class
THIRD_PARTY.txt
META-INF/maven/org.odpi.egeria/http-helper/pom.xml
META-INF/maven/org.odpi.egeria/http-helper/pom.properties
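The manual `jar -tf` check above could be automated across many artifacts. This is a minimal sketch using only Python's standard `zipfile` module (a jar is a zip archive); the helper names `has_license_file` and `build_demo_jar` are hypothetical, and the demo archive is built in memory rather than read from a real Egeria build.

```python
import io
import zipfile


def has_license_file(jar, name="THIRD_PARTY.txt"):
    """True if the archive (path or file-like object) has a root entry called `name`."""
    with zipfile.ZipFile(jar) as zf:
        return name in zf.namelist()


def build_demo_jar(entries):
    """Build an in-memory archive with empty entries, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in entries:
            zf.writestr(entry, "")
    buf.seek(0)
    return buf


with_license = build_demo_jar(["META-INF/MANIFEST.MF", "THIRD_PARTY.txt"])
without_license = build_demo_jar(["META-INF/MANIFEST.MF"])

print(has_license_file(with_license))     # prints True
print(has_license_file(without_license))  # prints False
```

In practice `has_license_file` would be pointed at the jar paths under the local repository or the build's target directories.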

The net effect is that

This was coded in a new top-level profile (-Plicense), but is currently used by default unless '-DskipLicense' is added.

The format of both files:

➜  egeria git:(issue1610) ✗ cat licenses/THIRD_PARTY-full.txt | head

Lists of 466 third-party dependencies.
     (BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
     (Unknown license) ASM Core (asm:asm:3.1 - http://asm.objectweb.org/asm/)
     (Eclipse Public License - v 1.0) (GNU Lesser General Public License) Logback Classic Module (ch.qos.logback:logback-classic:1.2.3 - http://logback.qos.ch/logback-classic)
     (Eclipse Public License - v 1.0) (GNU Lesser General Public License) Logback Core Module (ch.qos.logback:logback-core:1.2.3 - http://logback.qos.ch/logback-core)
     (The Apache Software License, Version 2.0) ZkClient (com.101tec:zkclient:0.10 - https://github.com/sgroschupf/zkclient)
     (The Apache Software License, Version 2.0) HPPC Collections (com.carrotsearch:hppc:0.7.1 - http://labs.carrotsearch.com/hppc.html/hppc)
     (Apache 2) DataStax Java Driver for Apache Cassandra - Core (com.datastax.cassandra:cassandra-driver-core:3.7.1 - https://github.com/datastax/java-driver/cassandra-driver-core)
     (Apache 2) DataStax Java driver for Apache Cassandra(R) - core (com.datastax.oss:java-driver-core:4.3.0 - https://github.com/datastax/java-driver/java-driver-core)

There are currently a handful of unknown licenses:

✗ cat licenses/THIRD_PARTY-full.txt | grep -y unknown
     (Unknown license) ASM Core (asm:asm:3.1 - http://asm.objectweb.org/asm/)
     (Unknown license) commons-beanutils (commons-beanutils:commons-beanutils:1.7.0 - no url defined)
     (Unknown license) dom4j (dom4j:dom4j:1.6.1 - http://dom4j.org)
     (Unknown license) servlet-api (javax.servlet:servlet-api:2.5 - no url defined)
     (Unknown license) jsp-api (javax.servlet.jsp:jsp-api:2.1 - no url defined)
     (Unknown license) Antlr 3 Runtime (org.antlr:antlr-runtime:3.2 - http://www.antlr.org)
     (Unknown license) zookeeper (org.apache.zookeeper:zookeeper:3.4.6 - no url defined)
     (Unknown license) Jettison (org.codehaus.jettison:jettison:1.1 - no url defined)
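For anything beyond a quick grep, the report lines can be parsed into structured records. The sketch below assumes every entry follows the layout shown in the sample output above, i.e. one or more parenthesised license names, the project name, then `(group:artifact:version - url)`; the regular expression and function names are hypothetical and only checked against those sample lines.

```python
import re

# One or more "(license)" groups, a project name, then "(g:a:v - url)".
ENTRY = re.compile(
    r"^\s*((?:\([^()]*\)\s*)+)"               # leading license names
    r"(.*?)\s*"                               # project name
    r"\(([^()\s]+:[^()\s]+:[^()\s]+) - (.*)\)\s*$"  # coordinates and URL
)


def parse_report(text):
    """Parse report text into a list of dicts; lines that do not match are skipped."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue  # e.g. the "Lists of N third-party dependencies." header
        licenses, name, coords, url = match.groups()
        entries.append({
            "licenses": re.findall(r"\(([^()]*)\)", licenses),
            "name": name,
            "coords": coords,
            "url": url,
        })
    return entries


def unknown_licenses(entries):
    """Coordinates of entries where any license name contains 'unknown'."""
    return [
        e["coords"] for e in entries
        if any("unknown" in lic.lower() for lic in e["licenses"])
    ]


SAMPLE = """
Lists of 466 third-party dependencies.
     (BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
     (Unknown license) ASM Core (asm:asm:3.1 - http://asm.objectweb.org/asm/)
     (Eclipse Public License - v 1.0) (GNU Lesser General Public License) Logback Classic Module (ch.qos.logback:logback-classic:1.2.3 - http://logback.qos.ch/logback-classic)
"""

print(unknown_licenses(parse_report(SAMPLE)))
# prints ['asm:asm:3.1']
```

Keeping the licenses as a list preserves dual-licensed entries such as Logback, which a plain grep would not separate.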

UI licenses will need handling separately, probably via a manual file for now. The intent would be to add a file to the same top-level licenses directory.

planetf1 commented 4 years ago

I've added a first-pass into 1.2 which captures maven licenses.

During 1.3 we can look at the UI/node/js components, and hopefully get validation that what we have is sufficient. We should also fix up any license discrepancies, e.g. where a license is missing.

Moving this to 1.3

planetf1 commented 4 years ago

Example of missing licenses that need to be fixed (by configuration of the scan):

14:16:42,025 [WARNING] There are 2 dependencies with no license :
14:16:42,026 [WARNING]  - dom4j--dom4j--1.6.1
14:16:42,026 [WARNING]  - org.antlr--antlr-runtime--3.2
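To track these over time, the warnings can be pulled out of a build log. This is a rough sketch, assuming the `group--artifact--version` layout seen in the log lines above; the function name is hypothetical, and an artifact whose own name contains `--` would not be split correctly.

```python
import re

# Matches "[WARNING]  - group--artifact--version" at the end of a log line.
MISSING = re.compile(r"\[WARNING\]\s+-\s+(\S+?)--(\S+?)--(\S+)$")


def missing_license_coords(log_text):
    """Return 'group:artifact:version' for each dependency reported without a license."""
    coords = []
    for line in log_text.splitlines():
        match = MISSING.search(line.rstrip())
        if match:
            coords.append(":".join(match.groups()))
    return coords


SAMPLE_LOG = """\
14:16:42,025 [WARNING] There are 2 dependencies with no license :
14:16:42,026 [WARNING]  - dom4j--dom4j--1.6.1
14:16:42,026 [WARNING]  - org.antlr--antlr-runtime--3.2
"""

print(missing_license_coords(SAMPLE_LOG))
# prints ['dom4j:dom4j:1.6.1', 'org.antlr:antlr-runtime:3.2']
```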

github-actions[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

planetf1 commented 4 years ago

still valid
