cloudfoundry / java-buildpack

Cloud Foundry buildpack for running Java applications
Apache License 2.0

Add tags/labels/annotations to droplet based on detected framework #777

Closed robachmann closed 4 years ago

robachmann commented 4 years ago

From a provider's perspective, it would be valuable to have an inventory of deployed applications and some insight into the versions in use.

For the Java buildpack, it would be nice to see more than just the JRE version, namely the Spring (Boot) version in use, in order to implement governance policies and support security screenings.

The newly introduced droplet object in the CF API (/v3/droplets/:guid) has some fields that could hold such information:

{
  "guid": "585bc3c1-3743-497d-88b0-403ad6b56d16",
  "state": "STAGED",
  "error": null,
  "lifecycle": {
    "type": "buildpack",
    "data": {}
  },
  "execution_metadata": "",
  "process_types": {
    "rake": "bundle exec rake",
    "web": "bundle exec rackup config.ru -p $PORT"
  },
  "checksum": {
    "type": "sha256",
    "value": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  },
  "buildpacks": [
    {
      "name": "ruby_buildpack",
      "detect_output": "ruby 1.6.14",
      "version": "1.1.1.",
      "buildpack_name": "ruby"
    }
  ],
  ...
  "metadata": {
    "labels": {},
    "annotations": {}
  }
}

This could go either within buildpacks, e.g. "detect_framework" : "Spring Boot 2.2.4", or into the metadata labels or annotations.
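To illustrate the inventory use case, here is a minimal Python sketch that flattens a droplet resource into one row per buildpack entry. The field names `guid`, `buildpacks`, `name`, `version` and `detect_output` are taken from the sample above; `detect_framework` is only the field proposed in this issue and does not exist in the API today, and the function name `buildpack_inventory` is made up for illustration.

```python
def buildpack_inventory(droplet: dict) -> list:
    """Return one inventory row per buildpack entry of a droplet resource."""
    rows = []
    for bp in droplet.get("buildpacks") or []:
        rows.append({
            "droplet_guid": droplet.get("guid"),
            "buildpack": bp.get("name"),
            "version": bp.get("version"),
            "detect_output": bp.get("detect_output"),
            # Hypothetical field proposed in this issue; None until it exists.
            "detect_framework": bp.get("detect_framework"),
        })
    return rows


# Trimmed copy of the sample droplet shown above.
sample_droplet = {
    "guid": "585bc3c1-3743-497d-88b0-403ad6b56d16",
    "state": "STAGED",
    "buildpacks": [
        {
            "name": "ruby_buildpack",
            "detect_output": "ruby 1.6.14",
            "version": "1.1.1.",
            "buildpack_name": "ruby",
        }
    ],
    "metadata": {"labels": {}, "annotations": {}},
}

rows = buildpack_inventory(sample_droplet)
for row in rows:
    print(row)
```

With today's API the `detect_framework` column would simply stay empty, which is the gap this issue is about.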

What's your opinion on this, @nebhale ? Thanks and best regards

nebhale commented 4 years ago

I think this is a great idea, but it is very dependent on implementation. We've long put out a bunch of information about what the buildpack contributes, but it's been truncated to 255 characters because the database within CAPI restricts that column to this size. Because the vast majority of the information we already collect (probably about 80% of it) is already truncated away, we've never had any desire to add more.

The new droplet API does appear to have an arbitrarily long string to store more data in, but I suspect the value there is still governed by this unfortunately small table column. If you can establish that CAPI will utilize a longer string I'd be happy to take a look at adding this additional information.
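A rough Python sketch of what a 255-character cap does to a space-separated component list. The exact truncation rule (cut to 252 characters and append "...") is an assumption based on the description above, not a copy of the buildpack's code, and the component names are invented placeholders.

```python
MAX_LEN = 255  # column size mentioned above


def truncate_detect_output(components, max_len=MAX_LEN):
    """Join components with spaces and cap the result at max_len characters.

    Assumption: an over-long string is cut and suffixed with "...".
    """
    joined = " ".join(components)
    if len(joined) <= max_len:
        return joined
    return joined[: max_len - 3] + "..."


def fully_retained(components, max_len=MAX_LEN):
    """Return the leading components that survive truncation intact."""
    joined = " ".join(components)
    if len(joined) <= max_len:
        return list(components)
    budget = max_len - 3  # room left once "..." is appended
    kept, used = [], 0
    for component in components:
        end = used + len(component)
        if end > budget:
            break
        kept.append(component)
        used = end + 1  # account for the separating space
    return kept


# Placeholder component list, not real buildpack output.
components = [f"component-{i}=1.0.0" for i in range(40)]
truncated = truncate_detect_output(components)
kept = fully_retained(components)
print(len(truncated), "characters stored;", len(kept), "of", len(components), "components intact")
```

Anything appended to the end of the list, such as a framework version, would be the first thing lost under this scheme.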

nebhale commented 4 years ago

Note, we've already taken this requirement into account in Cloud Native Buildpacks. If you build a Spring Boot application into an image with it, you'll see a ton of metadata including information about Spring Boot's version and dependencies.

robachmann commented 4 years ago

Thanks for your reply, and good to see you've already taken this into account for Cloud Native Buildpacks. If I understand you correctly, even Cloud Native Buildpacks wouldn't allow me to query that data as long as CAPI has these column restrictions. Before she stepped down, I would have asked @abbyachau for her input here; I'm not sure who has been nominated to succeed her. Maybe @emalm knows?

nebhale commented 4 years ago

Well there are two different strands to explore here.

  1. Current Java Buildpack. You probably want to ask @ssisil or @zrob if there has been any change in the CAPI db to allow more data into that column, which is likely what you're seeing in the droplet API.
  2. Move of CAPI to using Cloud Native Buildpacks. I don't think there's been any real exploration or requirements about how to expose the BoM metadata via any API. You'll want to pass those requirements on to the same group to make sure they aren't missed.

emalm commented 4 years ago

@robachmann: Currently @zrob is the only nominated candidate for the CLI project lead (via cf-dev), and the nomination period ends this evening (PST).

robachmann commented 4 years ago

Hi @zrob: Could you please tell Ben if the mentioned column has been expanded to hold that much information or what your recommended way would be to add framework- (or even library-) versions to the inventory? Thank you so much!

zrob commented 4 years ago

Hey @robachmann and @nebhale

First off, I think it makes a lot of sense to make this kind of build data available. That being said, I did some research and have some interesting findings. Sorry, but this is gonna be a long one.

To the quick question of "is the column larger", the answer is no. Additionally, just making that column longer will not be enough to solve the provided use case.

Java Buildpack and Cloud Controller interaction

The output from the section of the detect script that @nebhale linked, the one that performs truncation, is not always propagated to the API. This is because that information is surfaced by the detect script, which is not executed during staging if a buildpack is supplied at push time. I think the easiest way to grok this is by example.

Push using auto detection

cf push spring-app

Droplet output:

...
        "buildpacks": [
            {
               "name": "java_buildpack",
               "detect_output": "client-certificate-mapper=1.11.0_RELEASE container-security-provider=1.16.0_RELEASE java-buildpack=\u001B[34mv4.23\u001B[0m-https://github.com/cloudfoundry/java-buildpack.git#27c704b java-main java-opts java-security jvmkill-agent=1.16.0_RELEASE open-jdk-like-jr...",
               "buildpack_name": "client-certificate-mapper=1.11.0_RELEASE container-security-provider=1.16.0_RELEASE java-buildpack=\u001B[34mv4.23\u001B[0m-https://github.com/cloudfoundry/java-buildpack.git#27c704b java-main java-opts java-security jvmkill-agent=1.16.0_RELEASE open-jdk-like-jr...",
               "version": null
            }
         ],
...
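For anyone trying to consume this string, a small Python sketch of how the `detect_output` above could be split into components. It assumes the value is a space-separated list of `name` or `name=version` tokens with ANSI color codes embedded, and that a trailing "..." marks a cut-off last token; those are inferences from this one sample, not a documented format. The function name is made up.

```python
import re

# Matches ANSI color sequences such as ESC[34m and ESC[0m.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def parse_detect_output(detect_output):
    """Split a detect_output string into a {name: version-or-None} mapping."""
    text = ANSI_ESCAPE.sub("", detect_output)
    truncated = text.endswith("...")
    tokens = text.split()
    if truncated and tokens:
        tokens.pop()  # the last token was cut off, so it is unreliable
    components = {}
    for token in tokens:
        name, _, version = token.partition("=")
        components[name] = version or None
    return {"components": components, "truncated": truncated}


# The detect_output value from the droplet above.
detect_output = (
    "client-certificate-mapper=1.11.0_RELEASE "
    "container-security-provider=1.16.0_RELEASE "
    "java-buildpack=\u001B[34mv4.23\u001B[0m-https://github.com/cloudfoundry/java-buildpack.git#27c704b "
    "java-main java-opts java-security jvmkill-agent=1.16.0_RELEASE open-jdk-like-jr..."
)

parsed = parse_detect_output(detect_output)
for name, version in parsed["components"].items():
    print(name, version)
print("truncated:", parsed["truncated"])
```

Note that the JRE entry itself is already lost to truncation in this sample, so anything sorted after it never reaches the API.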

Push and specify java buildpack

cf push spring-app --buildpack java_buildpack

...
         "buildpacks": [
            {
               "name": "java_buildpack",
               "detect_output": "",
               "buildpack_name": "",
               "version": null
            }
         ],
...

Python buildpack by both autodetect and specification

...
         "buildpacks": [
            {
               "name": "python_buildpack",
               "detect_output": "python",
               "buildpack_name": "python",
               "version": "1.6.36"
            }
         ],
...

The python buildpack is able to provide the same information whether it is invoked by auto detection or by user specification. This is because the python buildpack provides a config.yml file via a supply script, which the java buildpack does not do.

It feels like a number of steps would need to happen to make this possible with the current buildpack. First, we would need a mechanism that produces the same results on the droplet regardless of whether the java buildpack is invoked by auto detection or by user specification. A config.yml could do that today if the java buildpack supplied one. I'm unsure if there's a downside to providing a supply script that drops off a config.yml.
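As a rough illustration of "a supply script that drops off a config.yml", here is a Python sketch of the file-writing step. It assumes the supply script receives a deps directory and an index and writes `<deps_dir>/<index>/config.yml`; the `name`, `version` and `config` keys are my reading of what other buildpacks emit and have not been checked against what the Cloud Controller actually parses. The function name and the example values are hypothetical.

```python
import tempfile
from pathlib import Path


def write_config_yml(deps_dir, index, name, version):
    """Write a minimal config.yml for one buildpack into its deps slot.

    Assumed layout: <deps_dir>/<index>/config.yml with name/version/config keys.
    """
    target = Path(deps_dir) / index / "config.yml"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Quote the values so a version like 4.23 is not read back as a float.
    target.write_text(
        f'name: "{name}"\n'
        f'version: "{version}"\n'
        "config: {}\n"
    )
    return target


# Demo against a throwaway directory; "java" and "4.23" are example values.
with tempfile.TemporaryDirectory() as deps_dir:
    path = write_config_yml(deps_dir, "0", "java", "4.23")
    contents = path.read_text()

print(contents)
```

Because the file would be written during supply rather than detect, the same values should be available whether or not the buildpack was named at push time, which is the property described above.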

Given a way to get the same metadata regardless of invocation method, the Cloud Controller could learn to accept new keys through which the java buildpack provides additional data. It would take some thought to decide whether that should be arbitrary data of any format shoved into a string field, or something more structured that both sides of the interface could agree on.

It is also worth mentioning that Cloud Foundry is currently working to adopt kpack as the staging implementation. This means that CNBs will be used and will provide all the information that @nebhale linked above. It may make more sense to consider how to surface the rich data available via kpack and on the generated OCI images rather than enhancing the current buildpack interface.

zrob commented 4 years ago

fyi a few folks @ssisil @danfein @pivotalrobbie @selzoc @goonzoid

nebhale commented 4 years ago

Since there's been no response on this issue in a couple of weeks, I'm going to close it. If you'd like to see it re-opened, please comment on the issue and I'll reopen it.