Open kdubb opened 1 month ago
I wrote an internal version of the module to quickly fix these issues for us. The solution is fairly simple with a few caveats because there is seemingly no way to "render" a key path in code consuming a config from the library; it always renders them itself before returning them.
So the fix looks like this:
ConfigImplUtil.parsePath
to retrieve the path from flattened keys.
a. Requires generating indexed paths with the [0]
part quoted to be a valid HOCON path.[.\s\"]
b. Join the elements using '.'If I find the time I can look at making a PR for this but I'm pretty swamped at the moment.
Thank you for the report.
I was considering deprecating and removing the HOCON source due to:
We have no plans to invest effort in an abandoned library unless it is super popular, which is not the case. For the time being, we do accept any PRs that improve the implementation, of course.
Would you consider moving away from the HOCON source?
I've read those as we use HOCON. As the previous maintainer has said a few times, the library works with a few known issues but no real showstoppers. All in all, it's much better than YAML for config (in practice YAML trades repetition for lots of unnecessary indentations).
That being considered, I would say that if SmallRye Config's YAML support included breaking keys with dots in them into key "paths"... we'd switch to that in a hearbeat. That's really "the thing" that makes HOCON better.
So a YAML format that treated:
this.is.a.key.path: the-value
as if it were
this:
is:
a:
key:
path: the-value
I would see no value in HOCON.
Not sure what SR Config is usign for YAML parsing but I know that being able to detect/specify quoated vs. unquoted keys is usually available (unless your using a meta-library like Jackson). So the above could be implemented and allow leaving quoted (single or double) keys alone.
We use snakeyaml as the parse which handles that case.
But we do handle it differently for a few reasons: https://github.com/quarkusio/quarkus/issues/11744
Yeah it's why we stopped using YAML in Quarkus, as I said you're trading repitition for indentation/newlines.
I'm wondering with we could add a "SimpleYAML" version to SmallRye and Quarkus. Where you get "simplified YAML" where common sense things like treating dots as path segments is done. It could be a configuration key but configuring your config source seems to make less sense than including quarkus-config-simple-yaml
and getting these features.
I think that pressing this idea as "common sense" distracts from the fact that doing this flattening would be incorrect, even if you hide it behind a mode. The reason is that a dot .
is a valid part of a configuration segment, and so deciding that it instead acts as a configuration separator for syntactic convenience means that many segments are now non-representable. Since this is not a part of core YAML, there is not any escape mechanism to make this work correctly, so if we wanted to support this, we'd need to layer on another escape mechanism, which would lead to other problems.
To me, "common sense" means "correct".
Ok, common sense might not be the correct terminology, but it certainly makes it simpler to use and achieves a configuration language that's much less wordy.
What's not representable if quoted keys are treated as a single segment?
Any segment with a .
in it (like a class name or logging category) would not be unambiguously representable.
How, if it's quoted it's treated as a single segment, if it's unquoted, dots are segment separators. I'm not sure how that is unambiguous.
Quoted how?
The same way you quote keys to include unsupported characters (e.g :
) currently.
"this.is.one.segment": value
'this.is.one.segment:including this': value
this.is.multiple.segments: value
# equivalent to
this:
is:
multiple:
segments: value
YAML makes no distinction between keys that are quoted and non-quoted by specification. This means it's generally not possible to distinguish these cases (i.e. whether or not a key was quoted) from a YAML-compliant parser like SnakeYAML. The solution is also incomplete, since things like foo."bar"
would (rightly) not be considered to be valid YAML.
You are arguing the specification of YAML and I am not. The spec clearly doesn't treat dotted keys as path segments. That doesn't mean it isn't a very useful thing to get the best of both worlds; removing the repetition of .properties
files without the unnecessary indentation and newlines required in YAML.
Regardless of the spec, SnakeYAML and most other YAML parsers do distinquish between quoted and unquoted keys during parsing. Mostly because you generally want some way of specifying this during serialization and they provide the same information during parsing.
So quoted vs unquoted is a nice simple way to achieve all this.
With regard to foo."bar".baz
.. that's a valid key in YAML (the quotes will be included) and it's easy to parse that as a single segment.
The specification is relevant for two reasons.
First of all, because the specification treats these cases as equal (AFAICT), then interpreting the document differently is in violation of the spec. The possible ramifications of this include (but are not limited to) having config be broken because some YAML-processing tool reinterpreted keys and de-quoted things that do not, by spec, need to be quoted.
The second reason is purely practical: I don't see how we can possibly know whether or not a key was quoted or unquoted after we get it out of the parser. So if that is indeed the case, we'd either have to hack into the parser or write our own, neither of which seems very appealing.
As I said previously, SnakeYAML surfaces parsed strings (including keys) as scalars with a ScalarStyle
which includes DOUBLE_QUOTED
, SINGLE_QUOTED
, and PLAIN
(amongst others).
No hacking needed.
From what I could tell, the ScalarStyle
only applies to the Dumper
and not the Parser
.
When I try to read foo.bar
or "foo.bar"
, the resulting key does not contain the quotes, but maybe I'm missing something. Can you also have a look?
Additionally, if we were to support it, I believe it would break the existing configuration. How would you solve it?
Even if we managed to solve the second problem, we still have the first one: that any random tooling might transform the file incompatibly because we're assuming things that are not promised in the specification. I wouldn't be opposed to some other quoting scheme - but such a scheme would have to be spec compliant at the very minimum.
From what I could tell, the ScalarStyle only applies to the Dumper and not the Parser.
The Parser
produces Event
s, ScalarStyle
is a property on the ScalarEvent
.
Even if we managed to solve the second problem, we still have the first one...
I admire your adherence to the standard, but I'm very much talking about "stretching it" 😆
This is why I suggested a completely different module, so that never the twain shall meet ( (e.g. SimpleYAML
but now I like NotSoIndentyAndNewlinyYAML
for the name).
The goal is to take small liberties with YAML, and use what the Parser
/Dumper
already provide to make the developer experience measurably better. Yep, we're doing things non-standard, but I can't see the drawback. Given that we're getting requests for keys from SR Config, not providing keys [1], we should be able to map whatever request comes in, to a "nicer to type" value.
Honestly, given the absolutely broken state of HOCON in SR Config; the idea of replacing it with a YAML that gives you nearly all the same convenience without relying on (technically) unmaintained libraries, seems like a real win.
The only drawback I see is if I load YAML into a "YAML Parser" I'm going to get keys like "this.is.a.key.path" instead of this: { is: { a: { key: { path: <value } } } }
. So... don't do that and expect to get anything but key paths (cause the objects won't have been created at each segment. The quoting issue really doesn't bother me. this."is.a".key.path
would show up exactly as you see it there. Anybody loading these YAML files should know to process them accordingly (or just use SR Config).
An actual nice name for it might be ConfigYAML
.
[1] - I know that SR Config does report keys to show config (e.g. in Quarkus) but the actual resolution seems to only rely on making requests of the ConfigSource
s. I do know the importance of showing the keys correctly too. When trying to figure out why our HOCON files weren't working we were able to plainly see that there were keys that were misrepresented.
Upon reflection, maybe ConfigYAML
is too... something. I could see people getting them mixed up if there were both a quarkus-config-yaml
and a quarkus-config-configyaml
extension 🤷♂️... I still like calling it ConfigYAML though haha
When I get some time, I will almost definitely implement this in our project, I've already got a HOCON config extension to get it working correctly.
Maybe the best course of action is to let me find the time to do that and report back. Once it's working we should learn pretty quickly if it causes any issues with editing, saving, etc. It wouldn't be a win if you're fighting with the IDE every you edit or save a "ConfigYAML" file.
Back to the original topic of HOCON. I really think you guys should fix or pull it. I'd vote for fix. It's not a terribly hard fix and HOCON is an accepted configuration format, but I would understand if you chose to drop it altogether.
The
Parser
producesEvent
s,ScalarStyle
is a property on theScalarEvent
.
Yes, but currently, from what I could tell, you would have to reimplement pieces of the parser to make it work. At least I couldn't find a way to get the quoted keys out of the box. I also looked into the snakeyaml issue tracker, and I couldn't find any related discussion. It seems it was never considered or required.
I understand the frustration, and we have had many users complain about the same thing in the past. I would like to improve this but don't see how to do it without a significant breaking change. Consider:
quarkus:
log:
category:
io.quarkus.category:
level: INFO
Translates to quarkus.log.category."io.quarkus.category".level
. Where io.quarkus.category
is a dynamic segment of the configuration (the logger). We know what composes the segment because it is quoted. If we change the behaviour to represent a multi-path, then it becomes quarkus.log.category.io.quarkus.category.level
, and you would required to mandate io.quarkus.category
in the yaml file to be quoted. It is all doable, but unfortunately, it is a significant breaking change.
Back to the original topic of HOCON. I really think you guys should fix or pull it. I'd vote for fix. It's not a terribly hard fix and HOCON is an accepted configuration format, but I would understand if you chose to drop it altogether.
Feel free to submit a PR; I'm happy to review and merge it. Thank you!
In HOCON, it's
Path.render
quotes anything it doesn't like, including things like%
for profiles,[]
for lists/arrays and/
used in projects like SmallRye FaultTolerance.Basically it produces a lot of quoted sections to the properties it produces that make them incorrect. Some examples:
%dev.some.property
becomes"%dev".some.property
org.acme.CoffeeResource/coffees/Retry/maxRetries=6
becomesorg.acme."CoffeeResource/coffees/Retry/maxRetries"=6
my.indexed.collection[0]=dog
becomesmy.indexed."collection[0]"=dog
There doesn't seem to be a way, currently, to represent these properties in HOCON files.