kubeflow / community

Information about the Kubeflow community including proposals and governance information.
Apache License 2.0

Develop a list of software licenses used by Kubeflow and its dependencies #599

Open jbottum opened 1 year ago

jbottum commented 1 year ago

We need an inventory of the software license(s) used by Kubeflow and its dependencies.

Kubeflow uses the Apache License 2.0, per this page: https://github.com/kubeflow/kubeflow/blob/master/LICENSE.

Kubeflow has many dependencies. We might use LicenseFinder or DepChecker to check the dependencies and their licenses.

jbottum commented 1 year ago

Tidelift is another potential tool for reviewing licenses and dependencies.

jbottum commented 1 year ago

Potentially we need to define what Kubeflow is and what is included. Other: potential dependency issues with Grafana, Minio, Prometheus.

@zijianjoy @akgraner @james-jwu do you have any comments on this? Is it required?

james-jwu commented 1 year ago

I got some feedback from Bob. We will likely need a list of licenses like:

Regarding license discovery, my understanding is that we need to scan all software and its dependencies within the container images for Kubeflow. KFP does this regularly, and we can perhaps share some process and experience here. @zijianjoy Is this something you can provide guidance on?

zijianjoy commented 1 year ago

To share KFP license scan process:

I suggest starting with the Kubeflow WG component images, and then potentially extending the license scan to their dependencies: https://github.com/kubeflow/manifests#kubeflow-components-versions. If this requirement comes from CNCF, I am guessing dependencies like Istio and Knative should already be license compliant, because they are CNCF projects.

jbottum commented 1 year ago

@DomFleischmann Have you been able to create an inventory of the Kubeflow images? Do we need our list to include the location where those images are stored?

jbottum commented 1 year ago

@akgraner @juliusvonkohout does either of you have a list of Kubeflow images with their locations?

juliusvonkohout commented 1 year ago

@jbottum I provided the list multiple times in the security meeting, although busybox and some Workbench images are missing. Dominik then uploaded it here: https://pastebin.ubuntu.com/p/4nMrk4SXjm/ The tags are missing, since they will change for 1.7 anyway.

juliusvonkohout commented 1 year ago

With a security repository I would have a place to store them, and people could add missing items or merge their own lists.

jbottum commented 1 year ago

@juliusvonkohout @DomFleischmann thanks! great work.

difince commented 1 year ago

Hi all, here is the list of images I got (for manifest v1.7-branch): kf_1.7.0_images

Below is the script used for generating the image list. The script needs to be placed in the root directory of the manifests repo in order to run.

VERSION=1.7.0
output_file="kf_${VERSION}_images"

# Try to delete 'tmp' file with force - no matter if it exists or not
rm -f tmp
# Iterate over all files with names: 'kustomization.yaml', 'kustomization.yml', 'Kustomization' found recursively in current directory
for F in $(find ./apps ./common \( -name kustomization.yaml   -o -name kustomization.yml -o -name Kustomization \)); do
  # Get path to the file
  dir=$(dirname -- "$F")
  # Generate k8s resources specified in 'dir' using 'kustomize build' command.
  # Check if the command fails and log the problematic folder.
  kbuild=$(kustomize build "$dir")
  return_code=$?
  if [ $return_code -ne 0 ]; then
    printf 'ERROR:\t Failed \"kustomize build\" command for directory: %s. See error above\n' "$dir"
    continue
  fi
  # Grep the output of 'kustomize build' for 'image:' and '- image:' lines,
  # and strip the leading whitespace, the optional '- ', and the 'image:' key.
  # Lastly, delete all empty lines and lines containing a '{' character.
  # Append the result to the 'tmp' file.
  grep -E '^\s*(-\s+)?image:' <<<"$kbuild" | sed -E 's/^\s*(-\s+)?image:\s*//' | sed '/^$/d;/{/d' >> tmp
done

# Sort the content of 'tmp' file, get the uniq records and redirect the output to 'output_file'
sort tmp | uniq > "$output_file"
# Clean 'tmp' file.
rm -f tmp

echo "File ${output_file} successfully created"

Disclaimer: the kustomize build command fails for the following directories:

@annajung brought up that there are example images that probably also need to be considered.
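
The extraction step of the script above (the grep/sed pipeline) can be tried in isolation, without kustomize, to check what it keeps and what it drops. This is only a minimal sketch: the function names `extract_images` and `sample_manifest` and the `example.registry/...` image names are made up for illustration and do not come from the manifests repo. POSIX character classes are used instead of `\s` so the pipeline does not depend on GNU extensions.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the original script): read rendered
# manifests on stdin and print the unique image references, sorted.
extract_images() {
  # Keep only 'image: ...' and '- image: ...' lines.
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//' \
    | sed -e '/^$/d' -e '/{/d' \
    | sort -u
}

# Made-up sample input standing in for 'kustomize build' output.
sample_manifest() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - image: example.registry/app-b:v2
        name: b
      - name: a
        image: example.registry/app-a:v1
      - name: a-again
        image: example.registry/app-a:v1
      - name: templated
        image: {{ .Values.image }}
EOF
}

sample_manifest | extract_images
```

The duplicate `app-a` entry collapses into one line, and the templated `{{ ... }}` value is dropped by the `'{'` filter, which mirrors what the script does before writing the output file.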

difince commented 1 year ago

As discussed in Slack on March 15, the script above should also cover the ./example folder of the manifests repo. So this line:

for F in $(find ./apps ./common \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \)); do

needs to become:

for F in $(find ./apps ./common ./example \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \)); do
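
One possible hardening of that loop, offered as a sketch and not as what the script actually does: `for F in $(find ...)` splits on whitespace, so a directory name containing a space would break it. Reading NUL-delimited paths avoids that and also lets each directory be listed only once. The function name `kustomization_dirs` and the throwaway demo tree are assumptions made up for this example, and `read -d ''` requires bash.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each directory under the given roots that
# contains a kustomization file, once, even if the path has spaces.
kustomization_dirs() {
  find "$@" -type f \
    \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \) \
    -print0 \
    | while IFS= read -r -d '' f; do
        dirname -- "$f"
      done \
    | sort -u
}

# Demo on a temporary tree that imitates the layout of the manifests repo.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/apps/a" "$demo_root/common/b c" "$demo_root/example"
touch "$demo_root/apps/a/kustomization.yaml" \
      "$demo_root/common/b c/Kustomization" \
      "$demo_root/example/kustomization.yml" \
      "$demo_root/apps/README.md"

(cd "$demo_root" && kustomization_dirs ./apps ./common ./example)
```

In the real script, the output of such a helper could feed the `kustomize build "$dir"` step in place of the `for F in $(find ...)` loop.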