avaje / avaje-config

Application configuration / properties loading for JVM applications
https://avaje.io/config
Apache License 2.0
50 stars 8 forks source link

SPI for initialization #43

Closed agentgt closed 1 year ago

agentgt commented 1 year ago

For a variety of reasons I often need to do my own initialization.

Avaje-config uses a plain static single and not the static service locator pattern like logging facades and their frameworks do.

This is problematic if you want to guarantee something always happens before or after avaje-config has loaded.

Where this usually rears its head is unit testing. It can be difficult and error prone to control the initialization of Service Locators. It's also a problem if you want to plug in a different implementation for testing or perhaps a threadlocal version for some sort of context config.

The simple solution is to make an interface like ConfigProvider or ConfigurationProvider (I'm still confused what the difference is between the two) that gets loaded with the ServiceLoader and if there are no implementations then a default one is loaded.

Consequently I recommend making an AbstractConfiguration that makes it easy for others to make their own custom Configuration implementations (again not sure if Config or Configuration is the better choice here).

SentryMan commented 1 year ago

ConfigProvider or ConfigurationProvider (I'm still confused what the difference is between the two) that gets loaded with the ServiceLoader

Adding a hook to do something before config initializes seems pretty doable.

If you want a behavior after config loads from files/env/whatever, you can implement ConfigurationSource which gets service loaded. From there you can have some logic to add additional properties. Do you want an additional hook after that?

AbstractConfiguration

Configuration is already an interface? Though I guess it is a long one.

agentgt commented 1 year ago

Do you want an additional hook after that?

Yes because what I do in my other config framework is to set System properties (e.g System.getProperties()) for legacy and/or third party libraries.

There are lots of libraries that are only configurable via a system property.

Thats my main use case.

I guess another is to replay configuration loading events in the advent of failure but the library doesn't expose that other than the System.Logger and even then it doesn't log loading IIRC. However that is a separate issue and can probably be fixed by just adding more logging and/or a pluggable Logger.

That is why I brought up the AbstractConfiguration or DefaultConfiguration instead of an interface (will have the interface as well).

The idea being you can easily override methods to wrap before and after behavior. That is much easier I think than providing a whole bunch of SPI hooks like PostProcessConfig which wreaks of Spring :)

SentryMan commented 1 year ago

set System properties (e.g System.getProperties())

Have you tried using config.load.systemProperties=true? this will load all properties into system properties after all configurations are processed.

agentgt commented 1 year ago

Yes I did see that and I was quite pleased however I was going to remove some properties from being put in (sensitive properties). To be honest I could probably make that feature work but would still like control of initialization.

I guess that is why I'm exploring this library (besides the petclinic project) is frankly the one we have in house is way too flexible so I don't mind any hesitation on adding features.

I made that mistake with ours and I like how compact this library is.

SentryMan commented 1 year ago

I guess what's flying past me is what exactly you mean by initialization. Do you just want pre/post hooks?

For the system property thing I'm envisioning you could make a ConfigurationSource with a load method like this.

  @Override
  public void load(Configuration configuration) {

    var bannedProps = configuration.set().of("secretPropertyNames");
    var props = configuration.asProperties();
    props.entrySet().stream()
        .filter(e -> !bannedProps.contains(e.getKey()))
        .forEach(e -> System.setProperty(e.getKey().toString(), e.getValue().toString()));
    }
agentgt commented 1 year ago

Yeah I agree that might work.

I guess what's flying past me is what exactly you mean by initialization. Do you just want pre/post hooks?

Im going to take a step back and explore some more.

Some of the initialization stuff I did before in various projects is no longer needed.

Some of its scope stuff (eg different config based on context$ and some of its hook.

My gut is that I eventually will need it but maybe that is not true.

However the real issue might be the API. I’ll add more later on why.

Anyway I massively appreciate taking the time to help me figure out stuff!

I’ll have to figure out someway to make up for it.

rbygrave commented 1 year ago

like ConfigurationProvider that gets loaded with the ServiceLoader and if there are no implementations then a default one is loaded.

No problem with doing that in principal although my gut is more telling me a "Lifecycle listener" more directly helps with what I suspect you are after here?

Where this usually rears its head is unit testing.

Hmmm, to me that is not expected as I tend to think that if we need anything more interesting in terms of testing configuration than what we can already do with application-test.properties then we should look really really hard at that. IMO something somewhere got more complex that it should.

plug in a different implementation for testing

Just to say I've never felt this need. To some extent it is the opposite of what I'm personally looking for in testing configuration. If some test needed more than different configuration values per say and instead wanted an entirely different configuration implementation then my gut says that more likely points towards code in src/main and how something is configured in main. I'm personally not intending to ever choose this path.

or perhaps a threadlocal version for some sort of context config.

In the future world dominated by Virtual Threads we need to be careful leaning on ThreadLocal too much. Have you hit this as a real requirement? My gut is telling me it's not keen.

Consequently I recommend making an AbstractConfiguration that makes it easy for others to make their own custom Configuration implementations

I'm not sure about making much of the internals public. Maybe put off the AbstractConfiguration part of the change into a different PR/Issue.

agentgt commented 1 year ago

Yeah let me rethink and check how I use config these days. I jumped the gun I think because of previous experiences and biases.

K8S and faster java startup has kind of changed everything for config for me these days. I no longer need the flexibility I used to.

And the threadlocal is a strong point but I think I do want some sort of lifecycle.

agentgt commented 1 year ago

I looked around at how we are using Config and I'm not sure I can make avaje-config work for our existing code base. The TL;DR is we are more instance based and less singleton on the config.

The initialization is sort of a red herring and I apologize for that. It's more the core API is incompatible with how we do things.

It certainly is worthwhile for a System.properties replacement in a microservice but we are just not there yet (ie fully microserviced) and I'm not sure we will ever be (or should be).

As for context, scope, testing and lifecycle stuff that avaje-config is missing and ultimately it boils down to the fact that almost all access is basically like System.getProperty albeit with initialization and some level of converting and refresh.

How our internal config library works is that it is composed of three modules:

  1. A Config wrapper that is analogous to Typesafe/lightbend Config and Microprofile config. Basically a more powerful Map<String,String>.
  2. A low level KeyValue (think Entry<String,String>) loading system that basically takes a URI and turns it into KeyValues which is essentially Iterable<KeyValue>.
  3. Common initialization aka bootstrap that uses the KVS module and does various other initialization and logging things <-- this was the part I was thinking about replacing with avaje-config

The Config wrapper module is very much like Lightbend Config pleasantly by accident.

What lightbend config and our library can do is take a subset of the Config and return it as a Config. This is useful if you have many instances of the same thing you need to configure. Thus we actually pass Config that isn't exactly the same key names as global config around a lot.

An example is say we have database properties where its like database.[tenant].* where tenant is variable.

e.g. database.example.username

What we do in our library is Config dbConfig = config.atPath("database").atPath("example");

Now to get username you do dbConfig.get("username") and in fact this is how our annotation processor works. It expects you to have "narrowed" the config such that only undotted keys are used.

Now could this be done with avaje-inject Configuration with Config.asConfiguration(...)? Maybe but it doesn't feel right for the spirit of this library at the moment.

Ultimately I think the best course of actions is for me to try to opensource our in-house library and figure out what can be learned and passed on to you guys while reducing the complexity of the Config and KVS system.

SentryMan commented 1 year ago

Why not do something like this?

var configPrefix= "database.example."
var username = Config.get(configPrefix+"username");
var password = Config.get(configPrefix+"password");
agentgt commented 1 year ago

var tenent1="example"; var username = Config.get("database.%s.username".formatted(tenent1));

Because the configuration may not come from a singleton. It maybe loaded dynamically.

But yes I did indeed think of creating my own wrapper or reuse our config library to call Config.get but that kind of defeats the purpose.

That is the core problem. There isn't this concept of a config instance. It isn't really that much better than System.properties. And if I'm writing low coupling middle layer modules why would I not just use System.properties if it is singleton?

A config instance is also used for testing the mapping of config to object albeit I suppose you could do something there as well.

agentgt commented 1 year ago

You have to understand I'm coming from the idea of Config is not singleton like Loggers because almost all other libraries do it the instance way:

And in large part its because these libraries are expecting to work with something else that handles singletons like DI framework.

That does not mean Avaje-config approach is wrong. I just haven't wrapped my head around the best way to do what we do which is in large part mapping key values to actual objects pojo/record like objects instead of the services pulling the configuration themselves.

Does that make since?

But there is a newer world order with microservices and such so maybe static config is a better approach but currently it doesn't fit in our model.

SentryMan commented 1 year ago

Yeah, I see where you're coming from. I come from a spring background where configs are basically all there at or very close to startup. I like avaje because now my static utility classes don't need an entire DI context to retrieve properties and do their work.

agentgt commented 1 year ago

I like avaje because now my static utility classes don't need an entire DI context to retrieve properties and do their work.

Indeed and that is my hesitation even on the initialization front particularly because I'm exploring a kind of modularized dropwizard approach (aka mixed stack) for avaje and various other libraries (particularly annotation processing) by implementing Spring's Petclinic without Spring.

Anyway in the coming weeks I should have a more clear picture on what works at least with annotation code generating libraries.

rbygrave commented 1 year ago

What we do in our library is Config dbConfig = config.atPath("database").atPath("example");

Well I think this would be pretty natural to avaje-config. For example, what we could do naturally is:


Configuration allConfiguration = Config.asConfiguration()
Configuration dbConfiguration = allConfiguration.for("database");
Configuration dbExampleConfiguration = dbConfiguration.for("example");

// or 
Configuration dbExample = allConfiguration.for("database.example");

Config is a scope around the "global" or "all" Configuration + the initialisation CoreConfiguration.initialise(). It's a convenience to the "global" / "all" configuration with all it's methods calling through to the underlying "global" configuration instance.

Edit: In case it helps, Config shall have no "logic" apart from CoreConfiguration.initialise() that is, all the methods in Config just proxy through to the underlying "global" Configuration.

SentryMan commented 1 year ago

Wow, I didn't know you could do that.

rbygrave commented 1 year ago

Well, that for(...) or atPath(...) method doesn't exist yet but it could and probably should.

SentryMan commented 1 year ago

Thus we actually pass Config that isn't exactly the same key names as global config around a lot.

So when #46 deploys we will get something like this. You can wire the returned configuration or multiple instances as beans to modify/retrieve different properties.

I just haven't wrapped my head around the best way to do what we do which is in large part mapping key values to actual objects pojo/record like objects instead of the services pulling the configuration themselves.

For static global configs, you can probably use the Config.get/getInt/whatever directly on the fields. Otherwise, I'm thinking you can wire a Configuration instance to the class and pull the configuration into your fields from there.

Archaius 2.x Microprofile Config Typesafe aka lightbend config

I admit that I am totally unfamiliar with these libs. Got any more examples of how you use these so I can get a clearer picture of what I should attempt to add?

A low level KeyValue (think Entry<String, String>) loading system that basically takes a URI and turns it into KeyValues which is essentially Iterable.

I'm having some trouble visualizing this, got an example of how you use it? Does it call the URI and read the stream? or else is the URI itself a KeyValue pair and you read that?

agentgt commented 1 year ago

I admit that I am totally unfamiliar with these libs. Got any more examples of how you use these so I can get a clearer picture of what I should attempt to add?

Yeah I was exploring what exactly are the killer features of our config wrapper and have been working on a simple contract of our API.

Here is an example of what our API looks like with callback dynamic properties removed:

https://gist.github.com/agentgt/78bd22d473d52d6e0c4ff9c45d5cd7d0

I have removed a bunch of convenience methods and what not to try to focus on what is happening but there are two features entirely missing from avaje-config but most of it boils down to error reporting and composing other properties.

In many config systems there is way more specific Map.Entry<String,String>. In our system we call this Property. In Microprofile config they call it ConfigValue. This entry has information about where the hell it came from and maybe some other stuff but the most important feature is that Property is essentially a lazy java.util.Optional.

By having a specific datatype to represent an entry as opposed to just making essentially Map interface is we get better error handling and composition.

But first let us get over the most important part of a config system. It needs excellent error reporting because often times the person configuring has very little details of the internals.

When we do something like:

String n = Config.get("some");

We get a lovely error message if "some" is missing (however I totally disagree with IllegalStateException as the error).

Now let us say we have some sort of value like type that we can convert from String that avaje does know about:

SomeType t = SomeType.parse(Config.get("some"));

Let us now assume "some" property is actually present and let us say its set to something like "1" (or anything else ambigous) but is just not in the right format that SomeType needs.

SomeType will in theory throw some exception like SomeType can't parse "1" but the user has no idea what property that is for or even if it is a configuration error. You see context was lost.

Of course the way to fix that is to make the conversion of SomeType happen inside a context we can attach correct error information. If you are pedantic enough you might notice the above is a good use case for the monad pattern but let's ignore academics for now.

You could do this with Avaje Config by adding a guess map like method.

Config.get("some", SomeType::parse);

Avaje would catch the exception that SomeType::parse throws and wraps (e.g. cause) with the information about which property. Unfortunately Avaje does not know anything about where the Property came from unlike all those libraries I mentioned even including Typesafe/lightbend config but I guess you at least now get the property name that failed conversion.

So the above kind of works for error handling but doesn't get you composition. That is say you want a fallback property. Think Optional.or.

Anyway moving on one of the most powerful features in my gist and in our config framework is to turn our Config into Function<String,String> while still retaining error reporting.

For example going back to Avaje Config:

We could do:

Function<String,String> f = Config::get;

// Nice we are completely decoupled from Avaje Config
// Now downstream let us say I want database.port

int port = f.compose("database."::concat).andThen(Integer::parseInt).apply("port");

If port is the wrong format say not an int you will get an exception that makes little since to the user that it is a configuration error.

In the gist we have a special Function<> implementation that allows you to compose functions while retaining the property info.

Function<String,String> f = config.asFunction();

Then if you do the previous example composition will work retaining error handling.

Ultimately I really do not like Avaje-config Configuration interface. I don't think it is much better than just dealing with a Map<String,String>. I realize some of the choices were probably because of older javas.

A lot of this is opinion and preference but here is the short list of problems:

Hey but despite all this I like the libraries opinionated loading of files and I realize APIs often make incorrect choices early but the best stick with those choices for backward compatibility while offering new choices.

I'm having some trouble visualizing this, got an example of how you use it? Does it call the URI and read the stream? or else is the URI itself a KeyValue pair and you read that?

This topic is complicated but basically the system loads plugins based on URI.scheme as well resolved MediaType which is usually by file extension.

In our system for example using load.properties style chaining you can do:

# main config.properties
_require_env=env://APP_PREFIX_//
_maybe_cmd=cmd://-D//
_maybe_https=https://internal/some.json

The env schema loads a plugin that loads properties from environment variables. The configuration is done by the URI in this case the first part of the path is used to select which environment variables.

The cmd:// uses command line arguments. There is also stdin:// which is useful to loading sensitive properties from stdin .

The https one will resolve mediatype and then pick a plugin based on that and in this case its JSON.

Does the world need the above complexity... maybe not but its more than 10 years old so it just kept getting more flexibile.

agentgt commented 1 year ago

Anyway what I really need to do is just opensource our config library and stop critiquing. Then you can look at that and decide what you like about it or not.

The petclinic app I have been working on has a subset of our in-house library so you can probably look at that as well once its done.

Right now Avaje-config once it has the event support it is a good system properties initialization library and I still might use it at some point particularly if property information like where it came from is added.

agentgt commented 1 year ago

You know what I'll just fork avaje-config and then you can see my idealist probably wrong point of view with the changes I would make to ajave-config :)

SentryMan commented 1 year ago

All the defaultValue methods that are not primitive are @PolyNull. There are very few null analysis tools that can handle that correctly and ultimately get(String key, String defaultValue) will always be @Nullable which is ridiculously because get(String key) is not.

I'm not sure I understand your meaning, can you expand on this? Also, there is that getOptional method if you didn't want to use the default value ones (maybe we should make Optionals work for lists and sets? something to look into later)

EDIT: oh wait I get it now How do you propose we fix it? or do we just take an L here?

I seriously do not understand why all the onChangeXXX methods have specific primitives yet do not use their consumer counter parts like IntConsumer....

We'd need a bunch of extra code to accommodate the different consumers. Is autoboxing that big of a problem?

EDIT: nvm we got this now.

property information like where it came from is added.

I can make the attempt to add this in, but my question is how often do you find yourself in situations where this info is needed? I can't seem to picture where this comes in handy considering on initialization avaje logs where it loaded properties from.

Anyway what I really need to do is just opensource our config library and stop critiquing.

How can one learn and improve if they don't hear different perspectives? Even if we somewhat disagree on some stuff, I'm having a blast. :)

rbygrave commented 1 year ago

All the defaultValue methods that are not primitive are @polynull.

FYI: In master and in the next release the defaultValue are all non-nullable, the whole API is non-nullable (using @NonNullApi) so getOptional(key) is required when using optional properties that don't have a default.

if property information like where it came from is added.

FYI: About to merge in a PR where all the configuration entries are in an internal CoreEntry type. It should be pretty natural to add 'key' and 'source' to that. I think you are wanting those for the Function<String,String> type use.

Archaius 2.x

I've used that quite a lot. Didn't really like it. I think my biggest gripe being that the easy use cases that you want to use 90% of the time seem unnecessarily verbose (+ more painful use with tests / testing setup).

Microprofile Config

I've not used it per say but seen it's use in the Helidon internals which is probably a good example of it's use. I'll go look at it again. I did note that it was "relatively big" in terms of shear size of code fwiw.

agentgt commented 1 year ago

Helidon by far ignoring complexity and payload has the best configuration system I have seen. It’s well designed and modularized.

I didn’t include it in the list earlier because I have not used its raw version (ie non microprofile).

I do think the event system in avaje is not right as I made a similar mistake.

Rarely do you want to “react” to a single property and not should you fire events for each change.

What I learned from experience is you take a snapshot of the config let the user modify the snapshot and then update the real config map based on that and then notify each listener with the new map.

I’ll put a code example later.

rbygrave commented 1 year ago

want to “react” to a single property and not should you fire events for each change.

Should be address now by the series of:

... with ModificationEvent as our event bulk/batch of changes.

The likely followup is to deprecate the existing 'single property onChange()' listeners in favor of using the onChange(Consumer ...).

SentryMan commented 1 year ago

Wouldn't you know it, I just got in a situation at work where I just need to listen to a single property change. Can you give an example of how I could use the batch onChange to listen to an batch.region property change?

rbygrave commented 1 year ago

listen to a single property change

Well, I was thinking it might be good to keep the 'single string' onChange listener method if only for the reason that people just know how to use it (so deprecating the int, long, boolean variants and leaving the String one fully supported).

... but then we look at an example and compare onChangeLong to onChange ModificationEvent

example ...

Existing approach using onChangeLong

    AtomicLong valueHolder0 = new AtomicLong(1);
    data.onChangeLong("some.longKey", valueHolder0::set);

New approach using onChange ModificationEvent:

    AtomicLong valueHolder = new AtomicLong();

    data.onChange(event -> {
      long newValue = event.configuration().getLong("some.longKey");
      valueHolder.set(newValue);
    }, "some.longKey");

... and it's an example like that which makes me think we should not deprecate onChangeLong(). At a minimum we need to repeat the key that we are interested in. It's a lot less appealing to use for these single property change cases.

SentryMan commented 1 year ago

The event listener listens to all changes right? You also might need a check to determine if some.longKey was actually modified.

EDIT:

The event listener listens to all changes right?

I'm wrong on this I guess.

SentryMan commented 1 year ago

maybe we can change the name of the Event onChange to onBatchChange to further distinguish that it's for listening to multiple properties.

rbygrave commented 1 year ago

For myself in my IDE I'd:

... and I'm looking at all the onChange* methods and then looking to pick the right one for the job. So in that sense less I'm keen on onBatchChange based on the way I use the IDE. My gut says that decent javadoc could be enough here. I'll improve the javadoc and see what that looks like.

SentryMan commented 1 year ago

looking at all the onChange* methods

you drive a hard bargain. How do we feel about onChangeBatch ?

rbygrave commented 1 year ago

you drive a hard bargain. How do we feel about onChangeBatch ?

So I've improved the javadoc a bit: https://github.com/avaje/avaje-config/pull/61 ... my gut is currently telling me to stick to onChange and get the javadoc good in terms of telling people what it is relative to the other onChange methods. I'm maybe concerned that adding 'Batch' as a suffix might actually confuse some folks. Conceptually they are all "onChange" and the distinction that devs / users should care about is if they are only interested in changes to a single property or multiple properties.

Hmmm, for me I think it might good to park it for a few hours and then come back to it.

rbygrave commented 1 year ago

Just thought, I'll release 3.0-RC1 as it is ... given you are right now doing stuff with onChange and just in case you hit the bulk onChange case.

SentryMan commented 1 year ago

@agentgt any other ideas on how to improve?

agentgt commented 1 year ago

@SentryMan @rbygrave I have to play with it but I would want to customize classpath Resource Loading (e.g. ClassLoader.getResourceAsStream()) for the initialization part.

Very often if you uber shade a jar the resources need to be loaded differently.

SentryMan commented 1 year ago

Very often if you uber shade a jar the resources need to be loaded differently.

Can you give an example? I would imagine all you'd need to do is modify the resource path to match the location in the uber jar? Would you really need to modify the classloader for that?

agentgt commented 1 year ago

Can you give an example? I would imagine all you'd need to do is modify the resource path to match the location in the uber jar? Would you really need to modify the classloader for that?

You don't need to modify the classloader if you can control the resource path but you don't entirely have control of that currently.

Basically you can get in a scenario where the classloader that avaje-config is using cannot see classpath resources that the application or other libraries can and this largely has to do with initialization.

The two scenarios I have seen this are with embedded containers (e.g. tomcat embedded) with embedded jars ( I believe spring boot does this) or uber jars where all the classes get mashed into a single jar.

Consequently many libaries including the JDK ServiceLoader allow you to pass a different ClassLoader. Does it need to be the ClassLoader? No it does not. We can make an interface (like I did with my library) like:

public interface ResourceLoader {
   public URL getResource();
   //or
   public InputStream getResourceAsStream();
}

Anyway I need to play around with some sample projects that I can share with you later to highlight the various initialization challenges with correct resource finding (modules also make this complicated).

I'll try later this week to get you guys something.

rbygrave commented 1 year ago

We can make an interface (like I did with my library) like ...

I've done that now via https://github.com/avaje/avaje-config/pull/66

SentryMan commented 1 year ago

I have to play with it but I would want to customize classpath Resource Loading (e.g. ClassLoader.getResourceAsStream()) for the initialization part.

Looks like that's that https://github.com/avaje/avaje-config/pull/66

EDIT: Jinx

SentryMan commented 1 year ago

@agentgt have any other SPI related stuff you would like to see?

agentgt commented 1 year ago

I should know later this week when I can test on the avaje (and other stuff) petclinic project.

SentryMan commented 1 year ago

I should know later this week when I can test on the avaje (and other stuff) petclinic project.

Do you want any assistance with that? I've sort of exhausted any further ideas unless somebody suggests something.

agentgt commented 1 year ago

Do you want any assistance with that? I've sort of exhausted any further ideas unless somebody suggests something.

Sorry for the late follow up. Yes I will soon make it public its just that I have been busy with various family / work related things.

School vacation is on right now so let me follow up next week.

SentryMan commented 1 year ago

Well, in any case, I'd count this as resolved for now. We can close this and you can create a new issue if you think of anything else.

agentgt commented 1 year ago

Oh yeah it for sure is. Sorry about that.

rob-bygrave commented 1 year ago

All good, closing then :)