eclipse-archived / smarthome

Eclipse SmartHome™ project
https://www.eclipse.org/smarthome/
Eclipse Public License 2.0
863 stars 782 forks source link

Improve event bus threading #600

Closed kaikreuzer closed 6 years ago

kaikreuzer commented 8 years ago

migrated from Bugzilla #423517 status ASSIGNED severity enhancement in component Core for --- Reported in version unspecified on platform All Assigned to: Davy Vanherbergen

On 2013-12-07 17:21:38 -0500, Kai Kreuzer wrote:

Migrated from https://code.google.com/p/openhab/issues/detail?id=454

On 2014-05-23 18:00:49 -0400, Davy Vanherbergen wrote:

What happened to the EventPublisherImpl in SmartHome? I don't understand why all the synchronized blocks have been added. Doesn't this limit the concurrency of the eventbus to 1, whilst the EventAdmin is threadsafe and can handle concurrency? Wouldn't it make more sense to remove all of the synchronized keywords and eventAdmin null checks and change the osgi binding policy for the EventAdmin to static, that way the EventPublisher will only be available if there is an eventAdmin already set..

On 2014-05-26 03:01:55 -0400, Kai Kreuzer wrote:

The synchronized blocks were added by a contribution that tried to cover every potential problem with dynamic behaviour - but it was indeed much too restrictive and thus causes problems like this: https://bugs.eclipse.org/bugs/show_bug.cgi?id=435294. So the synchronized blocks will have to be removed again indeed.

On 2014-05-26 12:07:18 -0400, Davy Vanherbergen wrote:

I would like to propose 2 changes to the way the current event bus is implemented/used.

1) Change event bus implementation to something like: EventBus.java : Interface defining event bus. Defines methods for subscribers to (de)register themselves for events and for publishers to post events. EventBusImpl.java : default event bus implementation. This would be a custom implementation (not using EventAdmin) which is a multi threaded distributor of events. It would be limited to local (same jvm) events only. EventBusMQTT.java : an MQTT based event bus implementation. This could be an optional implementation in a separate bundle. Similar to the above, but using MQTT for delivery, which would make this very suitable to connect remote nodes. I believe that moving away from the EventAdmin would be the right way forward. It will help avoid the concurrency issues we are currently seeing with openHAB events and the OSGI service loading.

2) Change event hierarchy: I believe it would be useful to split events into more different types: StateEvent, CommandEvent, ConfigurationEvent, SystemEvent. ConfigurationEvents could be used for notifying changes in item/binding configuration and SystemEvents could be used for system related events like BINDING_ADDED, BINDING_REMOVED, BINDING_STARTED, NODE_STARTED, SYSTEM_STARTED, etc. Without system events, it will be very difficult to be able to know when the full SmartHome (master + nodes) is started and the startup rules can be triggered. Configuration events on the other hand, would be very usefull to configure items in remote nodes from the master node.

Would this fit within the SmartHome long term direction?

On 2014-05-27 09:25:56 -0400, Kai Kreuzer wrote:

1) Right, from our experience with the different EventAdmin implementations it is difficult to get a reliable behavior across different setups. So I would agree to move away from EventAdmin. Is the assumption that there is always exactly one EventBus implementation registered as an OSGi service? Or would there be any use case where multiple buses need to be supported? For the non-MQTT implementation, the Google Guava AsyncEventBus might actually be a good choice, what do you think? Could you come up with a suggestion for the EventBus interface? What features shall we put in there? Should it be more oriented towards Java objects like Guava or more towards topic structures like MQTT and EventAdmin? Furthermore, shall it support MQTT-like features such as QoS or last-will?

2) Agreed, we need to introduce new types of events (see also the "UI events" suggestion in https://www.eclipse.org/forums/index.php/mv/msg/628353/1376618/#msg_1376618). I wonder if the requirements for these "system/ui/config" events might be different than the ones for "item" events and if this could qualify for different interfaces? Probably not as it would make things rather complex. What is your take on this?

I would suggest to only address 1) in the context of this issue and create a second bugzilla issue for introducing new event categories.

On 2014-05-27 09:54:29 -0400, Davy Vanherbergen wrote:

1) I can't think of any use case where there would be a need for 2 eventbuses in a single runtime. At first sight the guava bus looks nice, but I'm not sure if it would make it more complicated to have interchangeable buses. I'll need to have a closer look at the Guava AsyncEventBus first. My initial idea was to keep the eventbus as simple as possible and orientate more towards Java objects. I haven't thought about QoS or last-will yet. I'm not quite sure yet what use cases for that would be. I'll see if I can come up with something more concrete for the EventBus.

2) Sounds like a plan.

On 2014-06-02 09:01:41 -0400, Michael Grammling wrote:

(In reply to Davy Vanherbergen from comment # 1)

What happened to the EventPublisherImpl in SmartHome? I don't understand why all the synchronized blocks have been added. Doesn't this limit the concurrency of the eventbus to 1, whilst the EventAdmin is threadsafe and can handle concurrency? Wouldn't it make more sense to remove all of the synchronized keywords and eventAdmin null checks and change the osgi binding policy for the EventAdmin to static, that way the EventPublisher will only be available if there is an eventAdmin already set..

The synchronization stuff is neither related to the synchronization of the EventAdmin implementation itself nor the null-check. It's related accessing the EventAdmin. Nevertheless I understand the problem and the deadlock situation.

Example (without synchronization): Thread 1: T1 Thread 2: T2

1.) T1: synchronous call by DS of OSGi to unsetEventAdmin(...) 2.) T2: e.g. postCommand(...), if (this.eventAdmin != null) -> true 3.) T1: unsetEventAdmin: this.eventAdmin = null; 4.) T2: this.eventAdmin.sendEvent(createCommandEvent(itemName, command, source)); 5.) NPE

We can now improve the code a bit to: EventAdmin localEventAdmin = this.eventAdmin; if (localEventAdmin != null) { localEventAdmin.sendEvent(createCommandEvent(itemName, command, source)); } else { throw new IllegalStateException("The event bus module is not available!"); }

1.) T1: synchronous call by DS of OSGi to unsetEventAdmin(...) 2.) T2: e.g. postCommand(...), if (localEventAdmin != null) -> true 3.) T1: unsetEventAdmin: this.eventAdmin = null; 4.) T2: localEventAdmin.sendEvent(createCommandEvent(itemName, command, source)); 5.) Undefined behaviour (e.g. no error or an IllegalStateException if it's a good implementation or a JVM crash, ...)

The synchronization is needed due to the dynamic behaviour of OSGi. The static feature will not help, because in that case you would have to synchronize against activate/deactivate instead of setEventAdmin and unsetEventAdmin. All good OSGi service implementations handle such issues. I will think about it which is the best synchronization mechanism for this use case (probably using a "localEventAdmin" and a try/catch block instead of using a synchronization block).

On 2014-06-26 11:03:14 -0400, Kai Kreuzer wrote:

FTR: The deadlock has been fixed with https://bugs.eclipse.org/bugs/show_bug.cgi?id=435294.

@Davy: So are you looking into "the plan" from above? I hope so and hence I assign this issue to you, if ok.

On 2014-06-26 16:10:07 -0400, Davy Vanherbergen wrote:

Hi Kai,

Yes, I started working on it. Expect to see something soon..

Davy

On 2014-06-28 13:13:15 -0400, Davy Vanherbergen wrote:

I took a stab at creating a new EventBus. The result can be viewed here: https://github.com/dvanherbergen/smarthome/compare/threading

The implementation is a simple one which doesn't stray very far from the previous implementation. To subscribe to an event you have to implement the appropriate interface and publish it via DS. There is currently a different interface to implement per event type. Publishing has stayed almost the same as well, you inject the eventBus via DS and use it to publish events.

I wanted to do some fancy annotation based event bus, but I couldn't convince myself that it was the right way to go. Using annotations only makes sense to me, if either one of the following would be true: 1) There would be many event types to support, which would lead to a large number of interfaces 2) There is a need to provide complex filtering methods to subscribers to only receive a subset of certain event types 3) The architectural direction of SmartHome is to move away from Declarative Services and to move towards annotations for osgi services (e.g. using OSGI 5 @component or peaberry)

For the first point, I can only think of 4 event types that would be needed for SmartHome (state, commmand, system, configuration), which is not that much. With regards to the second point, currently the filtering is done in the subscriber and this keeps things rather straight forward, so I'm not sure if we need a change here. The third point is really about consistency. If bindings already have to use DS anyway, it seems logical to also use DS for the EventBus. Maybe you can comment on the long term direction?

So far, the code hasn't been fully tested yet as the demo config seems to have disappeared ;-) With the proposed implementation, it will be easy to implement an MQTT based one. I'll start on that once the default bus approach is 'approved'.

Note that I've also included a new thread pool service providing two thread pools, one for executing tasks in a separate thread and one for scheduling repeating background tasks. If these thread pools are used consistently within openHAB/SmartHome, it will make it easier to tune the number of concurrent threads used by the application.

Looking forward to your point of view. I'm happy to change the implementation to annotations if it makes sense..

On 2014-06-30 01:03:12 -0400, Davy Vanherbergen wrote:

With regards to 3), the Weld CDI annotations (RFC 193) look like a good candidate to replace Declarative Services. It would keep things nice and clean and has the added advantage that J2EE developers will feel right at home without having to know Declarative Services. I would be in favor of such a move :-)

On 2014-06-30 06:11:48 -0400, Kai Kreuzer wrote:

Great, thanks, Davy! I will have a closer look at the code soon.

Wrt 3): I never liked the overhead of the DS xmls very much, so annotations were always on my mind - I just didn't find a suitable rfc-193 compliant implementation in the past that could be used as a replacement for DS. Weld looks interesting - as long as the one bundle with 1.5MB is enough to fully use it :-)

So in general, I am open to replace DS by CDI, but this is a major step that should be carefully tested evaluated. Good thing of DS is that there are many alternative implementations available and products based on ESH can choose one. So if we require a CDI implementation instead, things might become more complicated.

Anyhow, the main architecture approach for "addon" development is to hide as much OSGi specifics (such as DS) from the developers as possible. So the idea is to provide maven archetypes that generate required DS xmls for you and that you don't really have to deal with when implementing your logic - see e.g. https://github.com/eclipse/smarthome/tree/master/binding/org.eclipse.smarthome.binding.hue

Regarding the lacking demo configurations: Well, this should motivate people to write according OSGi unit tests :-) Besides this, I am currently working on an openHAB2 IDE, where these developments can be tested (with ESH bundles in the same workspace).

On 2014-06-30 08:35:54 -0400, Davy Vanherbergen wrote:

There is also the PAX CDI implementation (https://ops4j1.jira.com/wiki/display/PAXCDI/Pax+CDI), maybe that one is smaller.

Another alternative to DS in "addon" development could be to build custom annotations, e.g. @ThingHandlerFactory, @ThingTypeProvider, etc. Then there could be a bundle in Smarthome which would be responsible for registering these classes as services when the bundles are loaded. Sort of like a SmartHome specific mini-CDI implementation. Or it would also be possible to generate DS files from these annotations during the compile/build phase. In both cases, it would result in some 'reinventing the wheel' of things already done in BND or WeldCDI, but it would be a loosely coupled solution for addons that doesn't require direct use of DS and no dependencies on specific CDI/Build implementations other than what is in SmartHome.

On 2014-06-30 12:02:54 -0400, Kai Kreuzer wrote:

Are you sure that PAX CDI is an implementation? At a first glance it seemed to be just a CDI API packaging using weld as well... We also through about custom annotations for our own purposes, but didn't really feel good about developing our own DI solution - so we decided to stay with DS and rather try to hide it. One thing to keep in mind about annotation vs. XML: For annotations you need to scan every single Java class at bundle startup, while for XMLs you just need to check a certain folder. On low-end systems like a RaspPi that can sum up to a substantial load at startup time. Remember that CDI comes from the EE spec, i.e. solutions that usually run on huge servers.

On 2014-06-30 14:20:09 -0400, Davy Vanherbergen wrote:

Looks like PAX-CDI is indeed not an implementation..

On 2015-04-27 10:45:33 -0400, Dennis Nobel wrote:

Dany, are you still working on this issue?

On 2015-04-28 04:12:34 -0400, Davy Vanherbergen wrote:

Nope, not actively working on this anymore. The code was written but never reviewed. You can have a look at it here: https://github.com/dvanherbergen/smarthome/compare/threading

On 2015-04-30 09:17:00 -0400, Kai Kreuzer wrote:

The code was written but never reviewed.

Hm, this makes me feel pretty awkward now. I actually reviewed the code and liked it, but we never moved over to a PR because I somehow got lost in the "annotation" discussion... Having said this, I am still interested in solving this issue and moving away from the event admin like you suggested. One part of our discussion was about supporting additional new event types as well - this is actually a different issue, but one that we currently address (in a more generic and extensible way than in your prototype) - you will find some details here: https://www.eclipse.org/forums/index.php?t=msg&th=1066185&goto=1694087&#msg_1694087

Once this is implemented and merged, I think we could continue on the threading issue - provided that you are still interested at all.

On 2015-04-30 10:57:53 -0400, Davy Vanherbergen wrote:

Hi Kai, sorry, I never meant to make you feel awkward. I lost track of the issue myself too. The new event types look great. There is a lot going on at work at the moment, so the next weeks are really busy for me and I'm not sure when I will have time, but give me a shout when it is merged.

On 2015-04-30 11:08:32 -0400, Kai Kreuzer wrote:

Great, no worries, there is no hurry. Will keep you posted!

kaikreuzer commented 6 years ago

We have meanwhile our own OSGiEventManager, which extensively takes care of dispatching events to different threads - so this issue can be closed.