Allow thread-safe way to update Configuration

tinylog-org / tinylog

tinylog is a lightweight logging framework for Java, Kotlin, Scala, and Android

https://tinylog.org

Apache License 2.0

698 stars 82 forks source link

Allow thread-safe way to update Configuration #232

Open aberman opened 2 years ago

aberman commented 2 years ago

I have a Java agent which requires configuration at runtime. Currently, I am following https://github.com/tinylog-org/tinylog/issues/102 by creating an underlying logging provider and it mostly works. However, anything I do to Configuration yields an UnsupportedOperationException. My non thread-safe way right now reflectively modifies the frozen field so I can modify the Configuration. This is obviously not thread-safe and not a good solution.

I realize that locking the Configuration speeds things up, but it would be great if there was a thread-safe way to update it (without the UnsupportedOperationException occurring). I'm thinking a new method like the replace method (maybe change that one?) that creates a new Properties and sets it in the properties field in a synchronized way. I don't think it should matter if threads which are currently logging don't have the new properties because once the properties field is updated when I construct a new underlying TinylogLoggingProvider, it will load the new Configuration and be used for further logging.

q3769 commented 1 year ago

I'd like to understand this request a bit more....

@aberman:

What exactly happened when you say "anything I do to Configuration yields an UnsupportedOperationException"? Did you mean your application was making calls to, e.g., the Configuration.get method while the "reload" triggered by your reflection call was on-going at the same time, and that caused the UnsupportedOperationException?
Can you (or @pmwmedia) elaborate on why "locking the Configuration speeds things up"? I mean, the Properties class Javadoc, the Properties object holding the config data here is thread-safe without needing any external synchronization. In that case, why are we locking before accessing it?

To your point, the logging client threads probably don't care if the property data read is "fresh" at a given moment, as long as eventually all the client threads get the new data. In the same light, "how fast" the config data get refreshed probably doesn't matter, either. Maybe I am missing something here, thus the question #2 above. On the surface of the code, it seems the locking is not needed.

aberman commented 1 year ago

@q3769 :

No, the Configuration is frozen once it's loaded and you cannot modify anything in it once that happens. If you look in Configuration, you'll see that the set and replace methods throw the UnsupportedOperationException preventing any modification at runtime.

q3769 commented 1 year ago

Ok I see. Thanks @aberman.

@pmwmedia : I'd like to ensure I have the correct understanding on the business domain/feature level before trying to make change...

What exactly is the purpose of the frozen flag? Is it only to ensure no more writes after any read on the Properties field? Do we really need it, at all, especially in the context of delivering this "reload" feature?
What exactly is the purpose of the ReadWriteLock? As mentioned, the Properties class is already thread-safe on single property entries. Is the locking to ensure atomicity of the aggregated/bulk reads and writes such as "getSiblings" and "getChildren"? Based on the application's lifecycle workflow, is bulk-operation atomicity a real concern, i.e. do we really need the lock, at all? e.g. like @aberman said, if we were to make a whole new instance of LoggingProvider after the configuration reload, then the bulk atomicity probably doesn't matter because by the time the new provider instance is created, the entire reload of the configuration should have been done already.

pmwmedia commented 1 year ago

Can you (or @pmwmedia) elaborate on why "locking the Configuration speeds things up"?

I can best explain the performance benefit of a frozen configuration through the logger class. Let's imagine we issue many trace log messages that we log in development only, but we disable the output of trace messages in production at all.

public final class Logger {
    private static final LoggingProvider provider = ProviderRegistry.getLoggingProvider();
    private static final boolean MINIMUM_LEVEL_COVERS_TRACE = isCoveredByMinimumLevel(Level.TRACE);

    public static void trace(final Object message) {
        if (MINIMUM_LEVEL_COVERS_TRACE) {
            provider.log(STACKTRACE_DEPTH, null, Level.TRACE, null, null, message, (Object[]) null);
        }
    }
}

Then, the logger loads the configuration and sets MINIMUM_LEVEL_COVERS_TRACE to false in production. The field MINIMUM_LEVEL_COVERS_TRACE is a final boolean. So, the JVM knows that it will be false forever and the logic of the trace() method will never be executed. This means that the JVM can and will optimize the code and remove all calls of all trace() methods. What can be a better performance optimization but not calling the code at all!

What exactly is the purpose of the frozen flag? Is it only to ensure no more writes after any read on the Properties field? Do we really need it, at all, especially in the context of delivering this "reload" feature?

As we cannot reload a changed configuration in tinylog 2, we should fail as early as possible when a user tries it nevertheless.

Based on the application's lifecycle workflow, is bulk-operation atomicity a real concern, i.e. do we really need the lock, at all?

Optimizations are always welcome :)

By the way, tinylog 3 will support reloading the configuration. This feature isn't implemented yet but the architecture of tinylog 3 is already prepared for it.

q3769 commented 1 year ago

Can you (or @pmwmedia) elaborate on why "locking the Configuration speeds things up"?

the logger loads the configuration and sets MINIMUM_LEVEL_COVERS_TRACE to false in production. The field MINIMUM_LEVEL_COVERS_TRACE is a final boolean. So, the JVM knows that it will be false forever and the logic of the trace() method will never be executed. This means that the JVM can and will optimize the code and remove all calls of all trace() methods. What can be a better performance optimization but not calling the code at all!

The immutable nature and performance benefit of a final field is well understood but that is independent of the configuration's mutability. The final boolean field MINIMUM_LEVEL_COVERS_TRACE is read (and set) from the configuration once and only once. The final value won't change even if the configuration changes (not frozen) later. Thus, freezing the configuration after the final field is set will not help or hurt performance. Correct?

By the way, tinylog 3 will support reloading the configuration. This feature isn't implemented yet but the architecture of tinylog 3 is already prepared for it.

Can you please explain or point me to the details of the version 3 design to support the reload? Per my understanding so far:

Besides individual/single configuration property entry's thread-safety, there is no concern on atomicity of bulk operations. In that case, the ReadWriteLock can be removed.
To support reload of configuration, the frozen-after-read semantics need to be removed. E.g. the boolean field MINIMUM_LEVEL_COVERS_TRACE can no longer be final.

pmwmedia commented 1 year ago

Thus, freezing the configuration after the final field is set will not help or hurt performance. Correct?

A new configuration can require different minimum severity levels. This is the main reason why tinylog 2 freezes the configuration to prevent the user from changing the configuration in a way that cannot be applied.

Can you please explain or point me to the details of the version 3 design to support the reload? Per my understanding so far:

The plan for tinylog 3 is to allow the user to define explicitly a minimum severity level for reloaded configuration. The user can set it to trace for having no limitations. However, if the user knows that he or she will never enable some severity levels, a more severe level can be set for benefit from performance improvements.

q3769 commented 1 year ago

A new configuration can require different minimum severity levels. This is the main reason why tinylog 2 freezes the configuration to prevent the user from changing the configuration in a way that cannot be applied.

Thanks for the clarification. That makes sense in that freezing enforces conditional immutability but has nothing to do with performance.

@aberman : In that case, even if I submit a PR to remove the lock, it wouldn't help you because the frozen flag is intentional per the version 2 design and will probably stay until version 3. Meanwhile, you'd still have to use reflection to unset the frozen flag. Would you rather have that, or wait for version 3?

The plan for tinylog 3 is to allow the user to define explicitly a minimum severity level for reloaded configuration. The user can set it to trace for having no limitations. However, if the user knows that he or she will never enable some severity levels, a more severe level can be set for benefit from performance improvements.

Sounds good and interesting.