openhab / openhab-core

Core framework of openHAB
https://www.openhab.org/
Eclipse Public License 2.0

DSL-rules via UI cause "out of memory" errors (java heap space) #2031

Closed spy0r closed 3 years ago

spy0r commented 3 years ago

System (Proxmox VM):

stef@openhab3:~$ uname -a
Linux openhab3 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux

openHAB:

3.0.0 - Release Build

Java:

stef@openhab3:~$ java --version
openjdk 11.0.9.1 2020-11-04 LTS
OpenJDK Runtime Environment Zulu11.43+55-CA (build 11.0.9.1+1-LTS)
OpenJDK 64-Bit Server VM Zulu11.43+55-CA (build 11.0.9.1+1-LTS, mixed mode)

Scenario: DSL rules migrated from OH2 to OH3 via the UI cause out-of-memory errors (Java heap space). These simple rules run extremely long, up to many seconds, as seen in the rule status indicator in the UI or in the Java threads via top -H -p yourOH3javaProcess.
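For readers who would rather look at the threads from inside a JVM than through top -H, here is a minimal sketch using the standard java.lang.management API. It only reports on the JVM it runs in, so to inspect openHAB it would have to run inside that process or be adapted to a remote JMX connection. The class name ThreadCpuReport and its Row type are made up for illustration and are not part of openHAB.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper (not an openHAB class): lists threads by accumulated CPU time.
class ThreadCpuReport {

    /** One report row: thread name and accumulated CPU time in milliseconds. */
    public static final class Row {
        public final String name;
        public final long cpuMillis;

        Row(String name, long cpuMillis) {
            this.name = name;
            this.cpuMillis = cpuMillis;
        }
    }

    /** Returns the live threads of this JVM, busiest first; empty if CPU timing is unavailable. */
    public static List<Row> busiestThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return rows; // this JVM does not expose per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            long nanos = threads.getThreadCpuTime(id);    // -1 if the thread has already died
            ThreadInfo info = threads.getThreadInfo(id);  // null if the thread has already died
            if (nanos >= 0 && info != null) {
                rows.add(new Row(info.getThreadName(), nanos / 1_000_000));
            }
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.cpuMillis).reversed());
        return rows;
    }

    public static void main(String[] args) {
        for (Row row : busiestThreads()) {
            System.out.printf("%8d ms  %s%n", row.cpuMillis, row.name);
        }
    }
}
```

Taking two such reports a few seconds apart and comparing the CPU times shows which threads are currently busy, which is roughly what top -H shows from the outside.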

Solution: To get a stable OH3 instance I had to put those rules back into files, as in OH2. Same rules, same triggers, no problems. The rules then run so fast that you cannot even see the "running" state in the rule indicator.

See also https://community.openhab.org/t/openhab-3-openhabian-runs-out-of-memory-java-heap-space-errors-cpu-100-after-a-few-hours

An example rule was this:

rule "Wohnzimmerlicht"
when
    Time cron "0 0/5 * * * ?" or
    Item presence_wohnbereich received update
then
    if (presence_wohnbereich.state == ON) {
        if (now.hour >= 9 && group_light_wohnzimmer.state != ON && (lumen_wohnzimmer.state < 5 || now.isAfter((LokaleSonnendaten_Sonnenuntergang.state as DateTimeType).getZonedDateTime))) {
            sendCommand(group_light_wohnzimmer, ON)
        }
        else if (group_light_wohnzimmer.state == ON && lumen_wohnzimmer.state > 10) {
            sendCommand(group_light_wohnzimmer, OFF)
        }
    }
    else if (group_light_wohnzimmer.state != OFF) sendCommand(group_light_wohnzimmer, OFF)
end

spy0r commented 3 years ago

Here is another example: https://community.openhab.org/t/oh3-high-cpu-load-unresponsive-oh/111158/10

kaikreuzer commented 3 years ago

Could you post a very simple example on how to reproduce? Like taking a fresh OH install, creating a rule that does a log message every second and see that this goes into an OOM after a while?

spy0r commented 3 years ago

I set up a very small example that got me to 100% CPU usage after half an hour:

postUpdate(debugItem, now.toString)

Wait and watch the CPU load rise... I couldn't reproduce the OOM errors so far, but the rising CPU load will inevitably cause problems sooner or later.

image

image

boc-tothefuture commented 3 years ago

I have been building a JRuby-based rules engine, and during automated testing I have noticed that I periodically get OOM errors. I have not yet done a deep dive to determine the root cause. I am sharing this here in case it helps with debugging, since the OOM may be caused by something common to the script engine implementation rather than by the DSL-specific script engine.

kaikreuzer commented 3 years ago

Thanks for the report. I have created https://github.com/openhab/openhab-core/pull/2057 for it. It would be great if you could test a snapshot once that PR is merged and available in the distro!

@boc-tothefuture Thanks for the hint, but I think this issue here is very specific to the DSL script engine. If you see issues with other scripts as well, please try to analyze and create a dedicated issue for it.

spy0r commented 3 years ago

Yes, I will do the testing once it's available.

spy0r commented 3 years ago

I can confirm that this isn't happening anymore in a current snapshot (tested 3.1.0~S2123).

image

Thanks, everyone!

kaikreuzer commented 3 years ago

Excellent!

openhab-bot commented 3 years ago

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/system-crash/113899/8

gotling commented 3 years ago

My Pi 2 has been running sluggishly and I got the out-of-memory exception. In openhabian-config under "40 - openHAB Related" I switched to the snapshot build. Since then the setup has been snappy.

Is the plan to release a 3.0.1 build including this fix so new users downloading the stable image don't run into this problem?

openhab-bot commented 3 years ago

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/oh3-crashes-with-outofmemoryerror/114934/2

demlstda commented 3 years ago

Hi,

For me the problem is not solved (unless I use the .rules files again). The CPU load of the Java process goes through the roof (most of the time > 100%), even though enough memory is available. I am not enough of an expert to see what is really happening to the memory, but the screenshot shows the situation shortly before the OOM message appears in the log. The machine gets slower and slower and in the end no longer responds at all to a click in the web UI. I am on snapshot 3.1 build #2159 (Raspberry Pi 4, 8 GB).

Screenshot 2021-01-22 at 11 59 03

I did a lot of testing today with rules that were created and maintained via the web UI: entering rules, saving, running, all of it. In the end the responsiveness decreased...

Hope this input helps (?)

spy0r commented 3 years ago

I did another test run with the example I posted here, using build #2159 that you mentioned.

I did not run into any trouble again. Maybe you should open another issue and share more information on your specific setup. A small example that reproduces your behavior would make it easier for the developers to investigate.

demlstda commented 3 years ago

Which tool do you use to produce such a CPU load graph?

I will try to get more detailed information :)

DiKaY1969 commented 3 years ago

Has this been fixed in OH 3.0.1?

I only use DSL rules via MainUI.

I did not find it in the 3.0.1 release notes, or I overlooked it.

daverichter commented 3 years ago

It seems the patch was applied to version 3.0.1, but the problem still exists in version 3.0.1.

johannesbonn commented 3 years ago

I also have a heap space error in the logs with 3.1 Milestone 1. The system is unresponsive and slow.

@kaikreuzer: is this a known problem in the milestone?

openhab-bot commented 3 years ago

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/openhab-3-runs-out-of-memory-java-heap-space-errors-cpu-100-after-a-few-hours/110924/99

bodoweiss commented 3 years ago

This issue still exists in OH 3.0.1. Some users reported it in this thread: https://community.openhab.org/t/openhab-3-runs-out-of-memory-java-heap-space-errors-cpu-100-after-a-few-hours/110924/99 Since this ticket is closed, should it be reopened, or should a new one be raised?

kaikreuzer commented 3 years ago

What was fixed was indeed only the performance issue, but not necessarily a memory issue. Memory should be addressed by https://github.com/openhab/openhab-core/pull/2182, so there's no need to open another issue.
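To tell a memory problem apart from a pure CPU problem, it can help to log heap usage over time. Below is a minimal sketch using the standard MemoryMXBean. The class name HeapWatch, the five-sample loop and the one-second interval are arbitrary choices for illustration and are not taken from openHAB. Like any in-process check, it only reports on the JVM it runs in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not an openHAB class): prints heap usage of the current JVM.
class HeapWatch {

    private static final long MIB = 1024L * 1024L;

    /** Used heap as a percentage of the maximum, or -1 if the maximum is undefined. */
    static double usedPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no maximum is defined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    /** Formats one heap sample as a single log line. */
    static String describe(MemoryUsage heap) {
        return String.format("heap used=%d MiB committed=%d MiB percent-of-max=%.1f",
                heap.getUsed() / MIB,
                heap.getCommitted() / MIB,
                usedPercent(heap.getUsed(), heap.getMax()));
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample the heap a few times; a real check would run much longer.
        for (int i = 0; i < 5; i++) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.println(describe(heap));
            Thread.sleep(1000);
        }
    }
}
```

If the used value keeps climbing towards the maximum even after garbage collections, that points to a memory problem. A flat heap with high CPU load points to the performance problem instead.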

mhilbush commented 3 years ago

@kaikreuzer Please take a look at my post here.

openhab-bot commented 3 years ago

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/requests-always-fail/119667/5