m-m-m / util

Mature Modular Meta-Framework
http://m-m-m.sourceforge.net
Apache License 2.0

bug in message formatting in rare cases after bootstrapping #260

Open hohwille opened 3 years ago

hohwille commented 3 years ago

There seems to be a bug where an exception is thrown and we end up in this fallback scenario with a somewhat broken message: https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/base/AbstractNlsTemplate.java#L59

Interestingly, the same message works fine later on. The problem seems to happen only the first time that message is created, and only in rare cases (if called early during app startup).

To fix this issue, we actually need to know which exception was caught, and unfortunately it is not logged here. Therefore a first step will be to add logging of the exception here. Once we have the stacktrace, we can check whether there is a concurrency bug in the initialization/bootstrapping code.
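To illustrate that first step, here is a minimal sketch of a fallback path that logs the caught exception before returning the degraded message. It is not the actual AbstractNlsTemplate code: the class `FallbackFormatDemo`, the methods `doFormat` and `format`, and the use of java.util.logging are assumptions made only for this example.

```java
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FallbackFormatDemo {

  private static final Logger LOG = Logger.getLogger(FallbackFormatDemo.class.getName());

  /** Hypothetical stand-in for the real formatter; throws an NPE if arguments is null. */
  static String doFormat(String template, Map<String, Object> arguments) {
    String result = template;
    for (Map.Entry<String, Object> entry : arguments.entrySet()) {
      result = result.replace("{" + entry.getKey() + "}", String.valueOf(entry.getValue()));
    }
    return result;
  }

  /** Formats the template; on failure logs the cause with its stacktrace and falls back. */
  static String format(String template, Map<String, Object> arguments) {
    try {
      return doFormat(template, arguments);
    } catch (RuntimeException e) {
      // Passing the exception as last argument makes the logger print the full stacktrace.
      LOG.log(Level.WARNING, "Failed to format message template: " + template, e);
      return template + " (formatting failed: " + e + ")";
    }
  }

  public static void main(String[] args) {
    System.out.println(format("Hello {name}!", Map.of("name", "world")));
    System.out.println(format("Hello {name}!", null)); // triggers the logged fallback
  }
}
```

With the stacktrace in the log, the cases discussed in this issue (NPE versus a syntax error in the message) could be told apart directly.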

hohwille commented 3 years ago

For the record: We have massively simplified the NLS during our Java 11 module migration. However, updating from net.sf.m-m-m:mmm-util-nls to io.github.m-m-m:mmm-nls will require some refactoring of all messages. We therefore still provide support for net.sf.m-m-m:mmm-util-nls and can release a new version in that namespace with a fix as soon as we understand the reason for the error. Being able to reproduce the bug, and potentially even debug it or get the stacktrace, would make this easy.

hohwille commented 3 years ago

If the exception comes from here (1): https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/base/AbstractNlsTemplate.java#L55 then it actually originates here (the method is not overridden anywhere): https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/base/AbstractNlsTemplate.java#L43

There are two possible cases:

(1.1) nlsDependencies is actually null --> NPE

OR

(1.2) https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/impl/NlsMessageFormatterFactoryImpl.java#L34 throws the error, which technically means it is thrown in this constructor: https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/impl/formatter/NlsMessageFormatterImpl.java#L51 That in turn would mean that the message is simply invalid according to the syntax, which does not seem to be the case but has to be verified.

The other option is that the exception is thrown here (2): https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/base/AbstractNlsTemplate.java#L56 so it would be thrown somewhere here: https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/impl/formatter/NlsMessageFormatterImpl.java#L78

I would say that the latter option (2) is even more unlikely.

hohwille commented 3 years ago

In case of the NPE (1.1) the null would come from here: https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/impl/NlsMessageImpl.java#L89

Which technically leads us here: https://github.com/m-m-m/util/blob/f64572fa06062073e3a381163e8aa7fd389d37a9/nls/src/main/java/net/sf/mmm/util/nls/base/AbstractNlsDependencies.java#L34

I cannot see how this method can return null, but as there is no synchronization, the JVM with its strange memory model and the underlying CPU with potential out-of-order execution could hypothetically cause this. It does not make sense to me, and if Java really behaved that way it would be a real pity...

However, the most obvious explanation would be that the actual message format is invalid. But the project reporting this error said that the message is correct and usually works, and that the error occurs only sometimes, immediately after server startup, so something really odd involving concurrency may be happening here...
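For illustration only, the following sketch contrasts an unsynchronized lazy getter with the initialization-on-demand holder idiom. It does not reproduce the actual AbstractNlsDependencies code, which may look quite different: `SafeLazyInit`, `Dependencies`, `getUnsafe` and `getSafe` are hypothetical names. As far as I understand the Java Memory Model, a racy getter that reads a plain field twice (once for the check, once for the return) has no ordering guarantee between those two reads, so returning null is permitted in theory even though it is rarely observed in practice.

```java
public class SafeLazyInit {

  /** Hypothetical dependency container, standing in for the real one. */
  static final class Dependencies {
    final String name = "dependencies";
  }

  // Variant A: plain static field, no synchronization and no volatile.
  private static Dependencies unsafeInstance;

  /** Racy lazy getter: the field is read twice without any happens-before edge. */
  static Dependencies getUnsafe() {
    if (unsafeInstance == null) {          // first read
      unsafeInstance = new Dependencies(); // unsynchronized publication
    }
    return unsafeInstance;                 // second, independent read
  }

  // Variant B: holder idiom. Class initialization is guarded by the JVM,
  // so every thread sees a fully constructed instance.
  private static final class Holder {
    static final Dependencies INSTANCE = new Dependencies();
  }

  /** Thread-safe lazy getter without explicit locking. */
  static Dependencies getSafe() {
    return Holder.INSTANCE;
  }

  public static void main(String[] args) {
    System.out.println(getSafe() == getSafe());
  }
}
```

If a race of this kind turned out to be the cause, declaring the field volatile and reading it into a local variable once would be another way to close the gap.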
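One way to hunt for such a startup race would be a small stress harness that releases many threads at the same moment and collects whatever exceptions they hit. The sketch below is generic: `StartupRaceHarness` and `hammer` are hypothetical names, and the task passed in would have to be replaced by code that creates and formats the affected message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StartupRaceHarness {

  /** Runs the task on the given number of threads at once and returns all failures. */
  static List<Throwable> hammer(Callable<String> task, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    List<Future<String>> futures = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      futures.add(pool.submit(() -> {
        start.await(); // block until all workers have been submitted
        return task.call();
      }));
    }
    start.countDown(); // release all workers together
    List<Throwable> failures = new ArrayList<>();
    try {
      for (Future<String> future : futures) {
        try {
          future.get();
        } catch (ExecutionException e) {
          failures.add(e.getCause()); // the exception thrown inside the task
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for workers", e);
    } finally {
      pool.shutdown();
    }
    return failures;
  }

  public static void main(String[] args) {
    List<Throwable> failures = hammer(() -> "replace with message creation", 16);
    failures.forEach(Throwable::printStackTrace);
  }
}
```

Since the error is reported only for the first use of a message, each run would need a fresh JVM, and the harness would have to be the first code that touches the NLS classes.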