quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.56k stars 2.62k forks source link

Quarkus YAML configuration keys are implicitly escaped #11744

Closed yrodiere closed 4 months ago

yrodiere commented 4 years ago

Describe the bug When an application.yaml file contains keys that include dots (quarkus.datasource), these are implicitly escaped and considered as a a single key ("quarkus.datasource").

Expected behavior I would expect the key to be split and interpreted as a composite "path" : quarkus.datasource => quarkus, then datasource.

That behavior would be consistent with Spring's.

I also believe that automatically escaping keys goes against the least surprise principle: if I wanted to escape a key, I would do it myself.

Actual behavior The keys that include dots are interpreted as a single key (quarkus.datasource).

Worse: as the keys are at the root of the yaml file, they are considered as custom keys, and the user doesn't get any warning about configuration keys being invalid. Quarkus simply proceeds with bootstrap, ignoring the configuration keys completely.

To Reproduce

Steps to reproduce the behavior:

  1. Clone https://github.com/yrodiere/quarkus-bug-yaml
  2. Run ./mvnw quarkus:dev
  3. See how the banner appears even though application.yaml contains quarkus.banner.enabled: false

Environment (please complete the following information):

Additional context

I spent considerable time trying to find out why I was getting an error suggesting that I check I declared JPA entities (I did) when using following configuration file. As it turns out, the datasource configuration was simply being ignored, resulting in JPA entities being ignored as well...

quarkus.ssl.native: false

quarkus.datasource:
  db-kind: postgresql
  jdbc.url: jdbc:postgresql://${POSTGRES_HOST}/${POSTGRES_DB}
  username: ${POSTGRES_USER}
  password: ${POSTGRES_PASSWORD}

quarkus.hibernate-orm:
  database.generation: update

"%dev":
  quarkus.datasource:
    url: jdbc:postgresql:hsearch_demo
    username: hsearch_demo
    password: hsearch_demo
  quarkus.hibernate-orm:
    database.generation: drop-and-create
    sql-load-script: test-dataset.sql
gsmet commented 4 years ago

I think just the fact that the properties end up being ignored and you have no useful error message is a massive issue.

/cc @radcortez @dmlloyd

dmlloyd commented 4 years ago

This is operating as designed. If an intermediate key was given for a log category (for example), it must be handled as a single string element. Syntactically, there's no real good way to disambiguate this case. Having compound keys in YAML is generally unsupported (that is, the concept does not exist nor coherently map to YAML); if you compounded any key except for the root "quarkus.xxx" you would have gotten an exception.

I think the only useful thing we can or should do here is to add a validation that tests to see if the initial key segment begins with "quarkus."; if so, we should report the property name as an unrecognized configuration key instead of ignoring it as though it were part of some other config namespace.

yrodiere commented 4 years ago

This is operating as designed. If an intermediate key was given for a log category (for example), it must be handled as a single string element. Syntactically, there's no real good way to disambiguate this case

The documentation suggests wrapping the key with double quotes:

# For configuration property names that use quotes, do not split the string inside the quotes.
quarkus:
  log:
    category:
      "io.quarkus.category":
        level: INFO

Is that pointless?

IIRC the YAML spec defines separate syntactic elements for quoted and non-quoted keys...

dmlloyd commented 4 years ago

The interpretation of quoted or non-quoted keys is actually schema-dependent; while they're syntactically distinct, semantically having a key or value of "foo" versus foo is equivalent in most contexts AFAICT. So generally speaking I'd say the quotes are useless unless they are necessary at a lexical level for whatever reason (for example, embedding a : or something like that).

yrodiere commented 4 years ago

Ok, I understand.

Apparently Spring Boot does not have this problem because keys including dots are always use in a context where they are expected, and they never have nested keys:

Key Default Value Description
logging.group.*   Log groups to quickly change multiple loggers at the same time. For instance, logging.group.db=org.hibernate,org.springframework.jdbc.
logging.level.*   Log levels severity mapping. For instance, logging.level.org.springframework=DEBUG.

It removes the problem entirely, and enables a nice flexibility when writing configuration: we don't have to stick to nested structure everywhere.

Regarding solutions, now...

I suppose Spring Boot's solution is a no-no in Quarkus, since it requires a different structure for configuration properties.

An alternate would be to pass the "raw" configuration tree to extensions that need that kind of syntax, and let them automatically detect where the category ends. That's only possible if the configuration is reasonably simple, however, since we'd be missing out on automatic conversion and a whole lot of features. And I suppose when it comes to maintenance, this would become a big pain point.

Alternatively there could be a specific syntax for such keys? Double-double-quoting comes to mind as a way to just "make it work", though it would definitely be ugly:

quarkus:
  log:
    category:
      "\"io.quarkus.category\"":
        level: INFO

I suppose there could be more elegant ways to escape them, e.g. </> (or others):

quarkus:
  log:
    category:
      <io.quarkus.category>:
        level: INFO

But then we're changing the syntax, and it would become rather different from *.properties files. And of course, if we want to remove auto-escaping by introducing this syntax as the exclusive way to configure logging, we'd have to break backward compatibility...

Oh well. I guess it's too late for that kind of changes :/

radcortez commented 3 years ago

Maybe a little crazy idea:

When we flatten, we could include both versions of the key / value, the quoted and unquoted version when we find a a key that requires handling as a single string element.

Any thoughts?

dmlloyd commented 3 years ago

I think this could lead to hard to debug problems. Why does it need to be ambiguous?

radcortez commented 3 years ago

Ambiguous in which way?

dmlloyd commented 3 years ago

What I mean is that there is no "quoted an unquoted version". The YAML syntax is already unambiguous, and we already have distinct notions of singular (foo.bar.baz) and compound (bar.baz in foo."bar.baz") key segments. A single YAML entry equals a single key segment. That's very straightforward and clear AFAICT, so I'm not sure what problem we're really trying to solve here.

yrodiere commented 3 years ago

@dmlloyd We're trying to make YAML more convenient to use, and to limit the amount of surprise for new users

Some competing frameworks (Spring Boot, to name it) understand dots in YAML properties when parsing the configuration. Quarkus ignores them.

As discussed above, the main problem is that we won't get an error when we write a configuration file assuming this is supported in Quarkus. You said this could be fixed, and maybe it's already been. In any case, this fix would be great: it would spare a few new users a great deal of head-scratching.

But even if that's fixed, the fact remains that there are people out there who find it inconvenient to be unable to "fold" property names, and to be forced to spell out every component of properties as a distinct node in the YAML tree.

For example, this won't work:

quarkus.ssl.native: false

quarkus.datasource:
  db-kind: postgresql
  jdbc.url: jdbc:postgresql://${POSTGRES_HOST}/${POSTGRES_DB}
  username: ${POSTGRES_USER}
  password: ${POSTGRES_PASSWORD}
"%dev".quarkus.datasource:
  jdbc.url: jdbc:postgresql:hsearch_demo
  username: hsearch_demo
  password: hsearch_demo

quarkus.hibernate-orm:
  database.generation: update
"%dev".quarkus.hibernate-orm:
  database.generation: drop-and-create
  sql-load-script: test-dataset.sql

I'll have to write this instead:

quarkus:
  ssl:
    native: false
  datasource:
    db-kind: postgresql
    jdbc
      url: jdbc:postgresql://${POSTGRES_HOST}/${POSTGRES_DB}
    username: ${POSTGRES_USER}
    password: ${POSTGRES_PASSWORD}
  hibernate-orm:
    database:
      generation: update

"%dev":
  quarkus:
    datasource:
      jdbc:
        url: jdbc:postgresql:hsearch_demo
      username: hsearch_demo
      password: hsearch_demo
    hibernate-orm:
      database:
        generation: drop-and-create
      sql-load-script: test-dataset.sql

Is this a big deal? Probably not. Would it be nice to improve it? Some apparently think so :)

radcortez commented 3 years ago

The main problem is that in a ConfigSource we need to use a String key value pair, meaning that we have to flatten the YAML to that format. We had problems with this flattening before, for instance: https://github.com/smallrye/smallrye-config/issues/216, https://github.com/smallrye/smallrye-config/issues/391.

Regarding the dots, we parse it as a single key, to express context, like we do with the logging categories, but we could also keep a second entry key that represents the entire hierarchy. Something like:

foo.bar:
  val:
    foobar

Currently translates too: "foo.bar".val=foobar

Adding a second entry we could have both combinations: "foo.bar".val=foobar foo.bar.val=foobar

And should allow you to retrieve the value when you want to use an hierarchy or when you want to represent single keys.

A few downsides:

dmlloyd commented 3 years ago

The main problem is that in a ConfigSource we need to use a String key value pair, meaning that we have to flatten the YAML to that format. We had problems with this flattening before, for instance: smallrye/smallrye-config#216, smallrye/smallrye-config#391.

Sure, but this is due to a design problem of MP config itself, which is a separate discussion (e.g. https://github.com/eclipse/microprofile-config/issues/550) and something we should look at independently in SmallRye (for 2.0 perhaps?).

Adding a second entry we could have both combinations: "foo.bar".val=foobar foo.bar.val=foobar

But if the second combination conflicts with another config property, suddenly setting one entry in YAML can set more than one property on the back end which can result in errors or surprising behavior - even security vulnerabilities. Even more confusingly, these would have different behavior:

foo:
  bar:
    baz

vs

foo.bar:
  baz

…since the latter will define an additional key. Since the whole point of the idea would be to cause foo.bar to have the same behavior as foo:\n bar: I think this would be even worse than the status quo. A feature that works sometimes but also has hidden side effects is worse than not having that feature.

radcortez commented 3 years ago

Sure, but this is due to a design problem of MP config itself, which is a separate discussion (e.g. https://github.com/eclipse/microprofile-config/issues/5450) and something we should look at independently in SmallRye (for 2.0 perhaps?).

Yes. BTW, this should be the right link: https://github.com/eclipse/microprofile-config/issues/545. And I agree that this should be modified.

I think this would be even worse than the status quo. A feature that works sometimes but also has hidden side effects is worse than not having that feature.

I'm fine either way. Was just trying to think if there was some solution we could implement.

amexboy commented 4 months ago

+1

I had one of these today. I'm just getting started with Quarkus and spend a bit over an hour trying to figure out why quarkus was trying to start a test container while I have the jdbc url already configured.

It also seems to have compromised the whole configuration:

quarkus:
  http:
    port: 9090
  datasource:
    db-kind: mssql
    username: ***
    password: ***
    jdbc.url: jdbc:sqlserver://localhost;database=dbname

Causes Caused by: io.quarkus.runtime.QuarkusBindException: Port(s) already bound: 8080: Address already in use. Any idea why having jdbc.url would cause the port configuration to not be taken?

yrodiere commented 4 months ago

I had one of these today. I'm just getting started with Quarkus and spend a bit over an hour trying to figure out why quarkus was trying to start a test container while I have the jdbc url already configured.

cc @radcortez if you've got some time :)

Causes Caused by: io.quarkus.runtime.QuarkusBindException: Port(s) already bound: 8080: Address already in use. Any idea why having jdbc.url would cause the port configuration to not be taken?

This is probably unrelated. Generally I get this when I try to launch an app but another one is already started. Rarely, that other app is running in the background so hard to spot. I think the error message also includes some info on how to spot which application is using this particular port (e.g. a Linux command); isn't it?

EDIT: Thoug I wonder why it's using port 8080 if you're telling it to use port 9090... :thinking: Some other profile-specific config, maybe?

amexboy commented 4 months ago

I do have something else on 8080. However, it shouldn't try that. Changing the jdbc.url bit fixed the problem. So, no idea what is causing this.

On Mon, Apr 22, 2024, 10:16 Yoann Rodière @.***> wrote:

I had one of these today. I'm just getting started with Quarkus and spend a bit over an hour trying to figure out why quarkus was trying to start a test container while I have the jdbc url already configured.

cc @radcortez https://github.com/radcortez if you've got some time :)

Causes Caused by: io.quarkus.runtime.QuarkusBindException: Port(s) already bound: 8080: Address already in use. Any idea why having jdbc.url would cause the port configuration to not be taken?

This is probably unrelated. Generally I get this when I try to launch an app but another one is already started. Rarely, that other app is running in the background so hard to spot. I think the error message also includes some info on how to spot which application is using this particular port (e.g. a Linux command); isn't it?

— Reply to this email directly, view it on GitHub https://github.com/quarkusio/quarkus/issues/11744#issuecomment-2068772603, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABX63LUI4IHQIZKSD3ENXE3Y6TBNFAVCNFSM4QQLKAS2U5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TEMBWHA3TOMRWGAZQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

radcortez commented 4 months ago

There is not much we can do here without a considerable amount of effort:

Either ditch the MP Config API or build something that can load each configuration source without additional conversions. See https://github.com/eclipse/microprofile-config/issues/550

Or Refactor our configuration to not require middle dynamic keys like described in https://github.com/quarkusio/quarkus/issues/11744#issuecomment-683820464

If there are other ideas, I'm happy to hear them :)

dmlloyd commented 4 months ago

I think the jdbc.url thing is due to JDBC dev services starting up based on some configuration heuristic. The fact that the config key is also incorrect is fairly coincidental I think (however I would have expected there to be an error along the lines of Invalid configuration key "quarkus.datasource."jdbc-url"" or something like that as well; the fact that this error was not reported sounds like a bug to me).

IMO we absolutely do not want to go down this route. It opens the door to many parseability problems that we really don't want to have to deal with.

radcortez commented 4 months ago

I tried it, and I indeed got the expected warning:

WARN [io.qua.config]] (Quarkus Main Thread) Unrecognized configuration key "quarkus.datasource."jdbc.url"" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo

dmlloyd commented 4 months ago

I'm going to go ahead and close this issue since we do not want to support the Spring-style key splitting for various reasons. For any other related problems, we should have separate issues.