eclipse-archived / smarthome

Eclipse SmartHome™ project
https://www.eclipse.org/smarthome/
Eclipse Public License 2.0
865 stars 782 forks source link

Thread pool 'safeCall' rejected execution of class java.util.concurrent.FutureTask #1575

Closed lolodomo closed 8 years ago

lolodomo commented 8 years ago

Starting OH 2 beta 3 with a rules file, several warning are logged:

2016-05-30 19:49:20.468 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@1feede' takes more than 5000ms.
2016-05-30 19:49:24.026 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@1feede' takes more than 5000ms.
2016-05-30 19:49:25.488 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@1feede' takes more than 5000ms.
2016-05-30 19:49:25.489 [WARN ] [rthome.core.common.ThreadPoolManager] - Thread pool 'safeCall' rejected execution of class java.util.concurrent.FutureTask
2016-05-30 19:49:29.040 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@1feede' takes more than 5000ms.
2016-05-30 19:49:29.041 [WARN ] [rthome.core.common.ThreadPoolManager] - Thread pool 'safeCall' rejected execution of class java.util.concurrent.FutureTask
2016-05-30 19:49:30.491 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.thing.internal.ThingManager@e65ff5' takes more than 5000ms.
2016-05-30 19:49:30.492 [WARN ] [rthome.core.common.ThreadPoolManager] - Thread pool 'safeCall' rejected execution of class java.util.concurrent.FutureTask
2016-05-30 19:49:34.042 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.thing.internal.ThingManager@e65ff5' takes more than 5000ms.

Is it normal ?

kaikreuzer commented 8 years ago

This sounds as if you have a misbehaving add-on. Could you try to de-install them and see which one is the culprit?

lolodomo commented 8 years ago

I forgot to mention that I got these warnings only when I define a rules file. Without my rules file, these warnings don't appear. Do you still want that I try to de-install add-on one by one ?

kaikreuzer commented 8 years ago

This sounds as if you have long-running/blocking rules then - is that possible?

tulamidan commented 8 years ago

Sounds familiar indeed. When the messages appear in the log file - no rule is executed in my case. So long running or blocking rules should not be the issue.

lolodomo commented 8 years ago

No I don'tthink I have long running rules. I could add log traces to check the duration of each of them. Could it be due to the initial compilation time of each rule ?

tulamidan commented 8 years ago

I did a test yesteday and I triggered a rule with a button. It took a while an reported a lot of the above mentioned errors. But in the end the rule fired. Yet it took really long. about a minute to switch of a couple of sensors and all lights in a group.

On OH1 it takes sub-seconds and I event had to out thread::sleep(300) in it to put less traffic on the bus. I'll try tonight without the sleep.

tulamidan commented 8 years ago

Just now I have tested the latest build and I got the impression that there is a promlem with binding my items to the event bus. (knx) Later the already posted "dispatching takes more than... " messages are written to the log.

07:39:03.369 [INFO ] [smarthome.event.ItemCommandEvent    ] - Item 'sLichtwert' received command 5296
07:39:03.431 [ERROR] [ore.internal.events.OSGiEventManager] - Dispatching/filtering event for subscriber 'org.eclipse.smarthome.core.events.EventSubscriber' failed: java.lang.NullPointerException
java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)[:1.8.0_65]
        at java.util.concurrent.FutureTask.get(FutureTask.java:206)[:1.8.0_65]
        at org.eclipse.smarthome.core.common.SafeMethodCaller.callAsynchronous(SafeMethodCaller.java:179)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.common.SafeMethodCaller.call(SafeMethodCaller.java:72)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.common.SafeMethodCaller.call(SafeMethodCaller.java:56)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.events.OSGiEventManager.dispatchESHEvent(OSGiEventManager.java:188)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.events.OSGiEventManager.handleEvent(OSGiEventManager.java:163)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.events.OSGiEventManager.handleEvent(OSGiEventManager.java:146)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415)[3:org.apache.karaf.services.eventadmin:4.0.4]
        at org.apache.felix.eventadmin.impl.tasks.HandlerTask.run(HandlerTask.java:90)[3:org.apache.karaf.services.eventadmin:4.0.4]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_65]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_65]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_65]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_65]
        at java.lang.Thread.run(Thread.java:745)[:1.8.0_65]
Caused by: java.lang.NullPointerException
        at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleContextHelper.getContext(RuleContextHelper.java:64)[125:org.eclipse.smarthome.model.rule.runtime:0.8.0.201605311814]
        at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleEngineImpl.executeRule(RuleEngineImpl.java:304)[125:org.eclipse.smarthome.model.rule.runtime:0.8.0.201605311814]
        at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleEngineImpl.executeRules(RuleEngineImpl.java:329)[125:org.eclipse.smarthome.model.rule.runtime:0.8.0.201605311814]
        at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleEngineImpl.stateChanged(RuleEngineImpl.java:198)[125:org.eclipse.smarthome.model.rule.runtime:0.8.0.201605311814]
        at org.eclipse.smarthome.core.items.GenericItem.notifyListeners(GenericItem.java:220)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.items.GenericItem.setState(GenericItem.java:195)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.items.ItemUpdater.receiveUpdate(ItemUpdater.java:70)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.items.events.AbstractItemEventSubscriber.receive(AbstractItemEventSubscriber.java:44)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.events.OSGiEventManager$1.call(OSGiEventManager.java:192)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.internal.events.OSGiEventManager$1.call(OSGiEventManager.java:1)[100:org.eclipse.smarthome.core:0.8.0.201605311814]
        at org.eclipse.smarthome.core.common.SafeMethodCaller$CallableWrapper.call(SafeMethodCaller.java:170)[100:org.eclipse.smarthome.core:0.8.0.201605311814]

        //********************rule*************************************************************
rule "Daemmerung"

when
    Item sLichtwert changed

then
    //PS im Wohnzimmer hat bei 50Lux geschaltet und es ist Dunkel
    //logInfo("Lichtwert","Lichwert f�r Daemmerung: "+sLichtwert.state)
    if(sLichtwert.state<5000 && Dunkel<1){      
        logInfo("Lichtwert","Daemmerung eingetreten bei:"+sLichtwert.state)
        sendCommand(EG_PS_Esszimmer_sperren,OFF)
        Dunkel=1
        Tageslicht.state=OFF

    }
    //der PS im Wohnzimmer hat bei 60lux geschaltet und es ist wieder hell
    if(sLichtwert.state>6000 && Dunkel>0){
        logInfo("Lichtwert","Daemmerung beendet bei: "+sLichtwert.state)
        //logInfo("Lichtwert","Daemmerung beendet ")
        if((Licht_EG_Wohnzimmer.state <1)&&(RGB_EG_Wohnzimmer.state<1)){
            Dunkel=0
            Tageslicht.state=ON
            sendCommand(EG_PS_Esszimmer_sperren,ON)
        } 

    }
end
tulamidan commented 8 years ago

Or maybe the server is too slow? It is running on a Raspi 2 - like a good share of the OH installations out there

kaikreuzer commented 8 years ago

No, a RasPi2 should be perfectly fine.

For any such problems it would be terribly helpful if you could reproduce it with the demo setup, i.e. simply add a new rule to it that makes the problem appear, so that others can help analyzing it.

Btw, the log suggests that you are using global variables within your rule file - I do not see this mentioned anywhere. Could you remove them and see if it makes a difference?

tulamidan commented 8 years ago

Right, there is a global. So I have created an isolated rule file with only the above mentioned and stripped away the global. The result is good. I still get one warning (takes more than 5000ms) but the rule executes fine. would you say globals are deprecated?

kaikreuzer commented 8 years ago

No, globals are not deprecated - it seems to be a bug regarding their processing then.

The demo setup also contains one - so could you check what is the difference to your case that makes the problem appear?

lolodomo commented 8 years ago

Just one idea: could it be relative to the initial loading (compilation ?) of each rule ?

I added one log message at first instruction and final instruction in each of my rules. Running 1.8.3 server, I can see that the initial run of a rule can take more than 5 seconds on a RPI 2, 6 seconds for example. Next calls are run fast. Note that several rules are called more or less at the same time when you start openHAB (while items are updated). On 2.0 beta 3 server, the logs show that even the first run of a rule is fast. But the warnings appear exactly around this time (before).

Could you increase the timeout to 10000 ms rather than 5000 ms to check it it solves the problem ?

lolodomo commented 8 years ago

If I suppress all my rules that are triggered by "Item xxx changed" and "Item xxx received update", that means all my rules that are systematically run at startup while the items are set by binding startup, the warnings are disappearing. So these warnings are relative to all rules run at startup when items are first set. Maybe the timeout of 5s is too much small for intial loading of rules on a RPI 2 ?

lolodomo commented 8 years ago

Is there a direct link between the log relative to event dispatching and the log relative to the thread pool ?

tulamidan commented 8 years ago

The most obvious difference between the globals in my rules and the demo rule is: Demo uses solely a timer as gobal - I am using a Number item in this particular rule.

lolodomo commented 8 years ago

For information, I am using global variables too.

kaikreuzer commented 8 years ago

I just tried all kinds of global variables and rules on the latest oh2 snapshot, but it all works smoothly for me... Is this only happening on slow computers like the RPi? I am currently testing on my Macbook, might this be a difference?

lolodomo commented 8 years ago

That is not systematic but sometimes after a new OH start my rules are not working at all and I get a message of this kind:

2016-06-07 01:50:48.785 [ERROR] [.script.engine.ScriptExecutionThread] - Error during the execution of rule 'Minimote appui long': An error occured during the script execution: The name 'MiniMoteSceneDeactivated' cannot be resolved to an item or type.

In this case, MiniMoteSceneDeactivated is the rule trigger. Could it be that rules are internally "compiled" before the items are loaded ? That is something I already encountered and reported several months ago.

In general, after a new restart, it is ok. Except this particular case, my rules are working well even if I have always the warning mentioned in my initial post at startup.

There is clearly one or even several issues with rules when starting OH2. But if you have chance, it can be without any impact.

mohsh86 commented 8 years ago

I have a similar issue, running latest OH2 snapshot on RPI3.

Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@1879618' takes more than 5000ms

One of the rules i defined has a 7 iteration loop for electricity bill calculation (performed up on 2 ArrayLists of type int

The messages appears once i update the rule file containing the loop (even if just adding comment) or altering the .items file (also even if just adding comment).

After having the messages in the log, the updates flood the log, then continues normally, it won't occur again if that loop is triggered, so it isn't the execution, like @lolodomo said, its like the first compilation of the rule.

i can zip my installation and throw it on a dropbox if that helps

lolodomo commented 8 years ago

An additional point that could be interesting. I tried to start OH2 without any rules. When OH2 is well started, then I put my rules file in conf/rules. At this point, I got the same logs. So that means the warnings do not depend on the full starting process but clearly only depends on the rules initial loading. That would mean that the problem should be easily reproducable on any RPI2.

weswitt commented 8 years ago

Sorry if this is bad form to comment after this is closed. I agree that this is an OH2 startup issue and not specific to MQTT. So is there an issue tracking the startup problems and will it be fixed anytime soon?

kaikreuzer commented 8 years ago

@weswitt This issue here is still open, not closed.

lolodomo commented 8 years ago

For information, with snapshot distribution 408 (offline), even if I have problems with bindings not initializing, I noticed that the warnings mentioned in this issue have dissapeared.

kaikreuzer commented 8 years ago

ok, thanks, so I close this.

lolodomo commented 8 years ago

You mai have waited for other feedbacks.

kaikreuzer commented 8 years ago

I can re-open the issue at any time.