quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.64k stars 2.65k forks source link

Maven extension plugin should enforce schema of extension metadata yaml #31625

Open holly-cummins opened 1 year ago

holly-cummins commented 1 year ago

Description

Several fields in the quarkus-extension.yaml file are treated as an array by some extensions, and a string by other extensions. We should enforce the typing to match what's in the docs.

We may also want to put similar enforcement in the registry itself to complain if improper data is uploaded, but let's start with catching the issue close to the source, so that the feedback loop is as small as possible.

Here are the fields where I've noticed problems:

Status (should be a string, sometimes treated as an array)

Example bad-case registry output (irrelevant bits trimmed):

  {
      "name": "Camel Async HTTP Client (AHC) Websocket",
      "metadata": {
...
        "status": [
          "stable",
          "deprecated"
        ]
      }
   {
      "name": "Camel IPFS",
      "description": "Access the Interplanetary File System (IPFS)",
      "metadata": {
...
        "status": [
          "stable"
        ]
      },
      "artifact": "org.apache.camel.quarkus:camel-quarkus-ipfs::jar:2.7.1",
      "origins": [
        "io.quarkus.platform:quarkus-camel-bom-quarkus-platform-descriptor:quarkus-bom-quarkus-platform-descriptor:2.7.7.Final:json:2.7.7.Final"
      ]
    }

This is most noticeable for deprecated extensions, where deprecated gets added on to the previous status. See, for example, https://github.com/apache/camel-quarkus/pull/3560/files#diff-853fef7d94cc4a8d9d8a7ed01a653a4fe0071710ce25e27f0923f1d75e79bf02 which added a deprecation element. However, there are 329 non-core extensions which treat status as an array, even if there is only one.

Codestart languages (should be an array, sometimes treated as a string)

Example bad-case registry output (irrelevant bits trimmed):

   {
      "name": "AWS Lambda",
      "description": "Write AWS Lambda functions",
      "metadata": {
...
        "codestart": {
          "name": "amazon-lambda",
          "kind": "example",
          "languages": "java",
          "artifact": "io.quarkus:quarkus-project-core-extension-codestarts::jar:3.0.0.Alpha4"
        },
      },
      "artifact": "io.quarkus:quarkus-amazon-lambda::jar:3.0.0.Alpha4",
      "origins": [
        "io.quarkus.platform:quarkus-bom-quarkus-platform-descriptor:3.0.0.Alpha4:json:3.0.0.Alpha4"
      ]
    },

There are about 9 of these in the registry.

Config (should be an array, sometimes treated as a string)

Example bad-case registry output (irrelevant bits trimmed):

    {
      "name": "JBeret - Batch Processing",
      "metadata": {
       "config": "quarkus.jberet.",
        "scm-url": "https://github.com/quarkiverse/quarkus-jberet",
      },
      "artifact": "io.quarkiverse.jberet:quarkus-jberet::jar:2.0.0",
      "origins": [
        "io.quarkus.registry:quarkus-non-platform-extensions:3.0.0.Alpha4:json:1.0-SNAPSHOT"
      ]
    },

JBeret seems to be the only extension with this issue.

Implementation ideas

No response

quarkus-bot[bot] commented 1 year ago

/cc @quarkusio/devtools (maven)

ia3andy commented 1 year ago

@holly-cummins have a look at all call to asStringList (which normalize the output to a List): https://github.com/quarkusio/quarkus/blob/main/independent-projects/tools/devtools-common/src/main/java/io/quarkus/platform/catalog/processor/MetadataValue.java#L42

like this: https://github.com/quarkusio/quarkus/blob/main/independent-projects/tools/devtools-common/src/main/java/io/quarkus/platform/catalog/processor/ExtensionProcessor.java#L54

I think it's a common practice to allow array defined as string when there is only one value.

holly-cummins commented 1 year ago

So I've normalised everything on the front-end in the extensions UI, I don't have a problem. :)

However, people were surprised when I mentioned that lots of extensions didn't follow the schema they thought we had. There was some feeling that we should we alert extension-writers if they deviate from the schema that's implied in the docs.

maxandersen commented 1 year ago
 "status": [
          "stable",
          "deprecated"
        ]

is breaking and shouldn't happen. code.quarkus.io don't find the deprecated ones afaics (as an example)

looking on registry the following extensions looks to be breaking the semantics of having more than 1 element:

 curl https://registry.quarkus.io/client/extensions/all | jq -r '.extensions[]|select(.metadata.status|type=="array")|select(.metadata.status|length>1)|.artifact'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  670k    0  670k    0     0   831k      0 --:--:-- --:--:-- --:--:--  833k
org.apache.camel.quarkus:camel-quarkus-caffeine-lrucache::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-cmis::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-dozer::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-elasticsearch-rest::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-hbase::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-hdfs::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-iota::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-microprofile-metrics::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-opentracing::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-ahc-ws::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ahc::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-atomix::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-beanio::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-beanstalk::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-elsql::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-etcd::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ganglia::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-hystrix::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-jing::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-msv::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-nagios::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-nsq::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ribbon::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-sip::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-soroush::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-yammer::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-kamelet-reify::jar:2.7.1
ia3andy commented 1 year ago
 "status": [
          "stable",
          "deprecated"
        ]

is breaking and shouldn't happen. code.quarkus.io don't find the deprecated ones afaics (as an example)

looking on registry the following extensions looks to be breaking the semantics of having more than 1 element:

curl https://registry.quarkus.io/client/extensions/all | jq -r '.extensions[]|select(.metadata.status|type=="array")|select(.metadata.status|length>1)|.artifact'
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100  670k    0  670k    0     0   831k      0 --:--:-- --:--:-- --:--:--  833k
org.apache.camel.quarkus:camel-quarkus-caffeine-lrucache::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-cmis::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-dozer::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-elasticsearch-rest::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-hbase::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-hdfs::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-iota::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-microprofile-metrics::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-opentracing::jar:2.16.0
org.apache.camel.quarkus:camel-quarkus-ahc-ws::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ahc::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-atomix::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-beanio::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-beanstalk::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-elsql::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-etcd::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ganglia::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-hystrix::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-jing::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-msv::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-nagios::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-nsq::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-ribbon::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-sip::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-soroush::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-yammer::jar:2.9.0
org.apache.camel.quarkus:camel-quarkus-kamelet-reify::jar:2.7.1

code.quarkus.io is only reading the first value of the status as only one status should be specified. It doesn't break if it's wrong as it should be resilient as possible. IMO this problem should be detected upstream.

ia3andy commented 1 year ago

What I meant earlier is that it's important to have a schema and detect mistake as early as possible: