Closed Orfait closed 5 years ago
Weird. I am using this binding. The problem could be inside the binding or in the library we rely on. To my knowledge nothing has changed for a long time. I have no idea where to start.
@clinique for information.
Could it be something specific to the Zulu JRE? Did you check with the Oracle JRE, which to my knowledge is the only recommended JRE for openHAB?
https://www.openhab.org/docs/installation/#prerequisites
OpenHAB should work with Zulu, Oracle JRE and OpenJDK. But for the last one, the docs say we could have compatibility issues.
In the docs, Zulu is the only one that does not have any disadvantage. So, the natural choice.
I will try Oracle JRE today.
@lolodomo : no change on my side and for good reason, I quit Free a year ago. So I cannot help... :(
So I officially become the only maintainer of this binding.
Sorry, yes. Maybe I'll start the same for Orange ...
First result: RAM usage went from 30% at the beginning to more than 61% after 3 hours.
So switching to the Oracle JRE does not improve RAM usage. I will give it a chance to stabilise over time. Let's see in 4-5 hours.
I am now reaching 100% of RAM usage.
What should we do to fix this issue?
And without this binding, you have no increase in memory?
When I remove it, memory usage is constant.
How do you measure the memory usage? Which command?
Which things did you set up, and with what refresh settings?
As OpenHAB is running in Proxmox (LXC container), I can see memory usage directly in Proxmox. I also confirmed that this memory usage is associated with OpenHAB :
Then, I started to disable addons one by one.
things :
Bridge freebox:server:FreeboxServer "Freebox" [ fqdn="mafreebox.freebox.fr:80", useOnlyHttp=true, appToken="xxx", refreshInterval=30 ] {
Thing phone telephone "Téléphone fixe" [ refreshPhoneInterval=10, refreshPhoneCallsInterval=10 ]
Thing net_device iphone1 "iPhone 1" [ macAddress="XXX" ]
Thing net_device iphone2 "iPhone 2" [ macAddress="XXX" ]
Thing net_device mobile1 "Téléphone 1" [ macAddress="XXX" ]
}
items :
String Freebox_Fwversion "Firmware Version" (Freebox) {channel="freebox:server:FreeboxServer:fwversion"}
Number Freebox_Uptime "Server uptime" (Freebox) {channel="freebox:server:FreeboxServer:uptime"}
Switch Freebox_Restarted "Just restarted" (Freebox) {channel="freebox:server:FreeboxServer:restarted"}
Number Freebox_Tempcpum "CPUm Temperature" (Freebox) {channel="freebox:server:FreeboxServer:tempcpum"}
Number Freebox_Tempcpub "CPUb Temperature" (Freebox) {channel="freebox:server:FreeboxServer:tempcpub"}
Number Freebox_TempSwitch "Switch Temperature" (Freebox) {channel="freebox:server:FreeboxServer:tempSwitch"}
Number Freebox_Fanspeed "Fan Speed" (Freebox) {channel="freebox:server:FreeboxServer:fanspeed"}
Switch Freebox_Reboot "Reboot Freebox" (Freebox) {channel="freebox:server:FreeboxServer:reboot"}
Number Freebox_LcdBrightness "Screen Brightness" (Freebox) {channel="freebox:server:FreeboxServer:lcd_brightness"}
Number Freebox_LcdOrientation "Screen Orientation" (Freebox) {channel="freebox:server:FreeboxServer:lcd_orientation"}
Switch Freebox_LcdForced "Forced Orientation" (Freebox) {channel="freebox:server:FreeboxServer:lcd_forced"}
Switch Freebox_WifiStatus "Wifi Enabled" (Freebox) {channel="freebox:server:FreeboxServer:wifi_status"}
Switch Freebox_FtpStatus "FTP Server Enabled" (Freebox) {channel="freebox:server:FreeboxServer:ftp_status"}
Switch Freebox_AirmediaStatus "Air Media Enabled" (Freebox) {channel="freebox:server:FreeboxServer:airmedia_status"}
Switch Freebox_UpnpavStatus "UPnP AV Enabled" (Freebox) {channel="freebox:server:FreeboxServer:upnpav_status"}
Switch Freebox_SambafileshareStatus "Window File Sharing Enabled" (Freebox) {channel="freebox:server:FreeboxServer:sambafileshare_status"}
Switch Freebox_SambaprintershareStatus "Window Printer Sharing Enabled" (Freebox) {channel="freebox:server:FreeboxServer:sambaprintershare_status"}
String Freebox_XdslStatus "xDSL Status" (Freebox) {channel="freebox:server:FreeboxServer:xdsl_status"}
String Freebox_LineStatus "Line Status" (Freebox) {channel="freebox:server:FreeboxServer:line_status"}
String Freebox_Ipv4 "IP Address" (Freebox) {channel="freebox:server:FreeboxServer:ipv4"}
Number Freebox_RateUp "Upload Rate" (Freebox) {channel="freebox:server:FreeboxServer:rate_up"}
Number Freebox_RateDown "Download Rate" (Freebox) {channel="freebox:server:FreeboxServer:rate_down"}
Number Freebox_BytesUp "Uploaded" (Freebox) {channel="freebox:server:FreeboxServer:bytes_up"}
Number Freebox_BytesDown "Downloaded" (Freebox) {channel="freebox:server:FreeboxServer:bytes_down"}
Switch TelephoneFixe_StateOnhook "State onhook" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:state#onhook"}
Switch TelephoneFixe_StateRinging "State ringing" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:state#ringing"}
String TelephoneFixe_AnyCallNumber "Any call Number" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:any#call_Number"}
Number TelephoneFixe_AnyCallDuration "Any call duration" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:any#call_duration"}
DateTime TelephoneFixe_AnyCallTimestamp "Any call timestamp" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:any#call_timestamp"}
String TelephoneFixe_AnyCallStatus "Any call status" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:any#call_status"}
String TelephoneFixe_AnyCallName "Any call name" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:any#call_name"}
String TelephoneFixe_AcceptedCallNumber "Accepted call Number" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:accepted#call_Number"}
Number TelephoneFixe_AcceptedCallDuration "Accepted call duration" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:accepted#call_duration"}
DateTime TelephoneFixe_AcceptedCallTimestamp "Accepted call timestamp" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:accepted#call_timestamp"}
String TelephoneFixe_AcceptedCallName "Accepted call name" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:accepted#call_name"}
String TelephoneFixe_MissedCallNumber "Missed call Number" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:missed#call_Number"}
Number TelephoneFixe_MissedCallDuration "Missed call duration" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:missed#call_duration"}
DateTime TelephoneFixe_MissedCallTimestamp "Missed call timestamp" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:missed#call_timestamp"}
String TelephoneFixe_MissedCallName "Missed call name" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:missed#call_name"}
String TelephoneFixe_OutgoingCallNumber "Outgoing call Number" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:outgoing#call_Number"}
Number TelephoneFixe_OutgoingCallDuration "Outgoing call duration" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:outgoing#call_duration"}
DateTime TelephoneFixe_OutgoingCallTimestamp "Outgoing call timestamp" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:outgoing#call_timestamp"}
String TelephoneFixe_OutgoingCallName "Outgoing call name" (Freebox) {channel="freebox:phone:FreeboxServer:telephone:outgoing#call_name"}
Switch IPhone1 "Reachable" {channel="freebox:net_device:FreeboxServer:iphone1:reachable"}
Switch IPhone2 "Reachable" {channel="freebox:net_device:FreeboxServer:iphone2:reachable"}
Switch Telephone1 "Reachable" {channel="freebox:net_device:FreeboxServer:mobile1:reachable"}
The binding requests some objects from the underlying library. A few of these objects contain or are lists of objects. I am not sure whether I need to clear the lists after using them or whether I should let the Java garbage collector deal with that. Maybe @maggu2810 can answer this question?
Looking at the code of the binding itself, I cannot find what could lead to a memory leak.
Nice, I can help for testing.
Maybe a problem here in the library ?
https://github.com/MatMaul/freeboxos-java/blob/master/src/org/matmaul/freeboxos/internal/RestManager.java#L165
execute gets the response from the HTTP request and opens a stream from its content. This stream is then parsed by readValue, but the stream is never closed.
Can an expert confirm that the stream has to be closed ?
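For reference, here is a minimal sketch of the usual pattern for this kind of concern. It is not the library's actual code: readAll and TrackingStream are hypothetical names, and an in-memory stream stands in for the HTTP response content. With Apache HttpComponents the content stream of a response is generally expected to be closed (or the entity fully consumed) so that the underlying connection can be released; try-with-resources guarantees the close even when parsing throws.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamCloseSketch {

    /** Hypothetical in-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Reads the whole stream as UTF-8 text and always closes it. */
    static String readAll(InputStream in) {
        // try-with-resources closes the stream on success and on failure.
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the JSON body of an HTTP response (made-up content).
        TrackingStream body = new TrackingStream("{\"success\":true}".getBytes(StandardCharsets.UTF_8));
        String json = readAll(body);
        System.out.println(json + " closed=" + body.closed);
    }
}
```

Whether an unclosed stream alone would explain growth of this size is a separate question; the sketch only shows how the close can be made unconditional.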
@Orfait : I will build a fixed version of the binding. Are you an advanced enough user to deploy a binding jar?
I can't build but I can deploy a jar.
Please check whether things are better with this version (attachment deleted). Don't forget to unzip the file first.
I installed it, let's see in a few hours.
Edit: memory usage is increasing at the same rate as before. 30% -> 50% after 2 hours.
I still get the issue: after a night, memory is at 97% (1.9 GB) and swap is at 474 MB. So I made a heap dump and removed the binding.
Unfortunately, the Memory Analyzer Tool does not run on my computer. So for now, the heap dump is useless.
@lolodomo did you ever analyse a heap dump? There is a good chance that it shows the cause. If needed I can take a look in the upcoming days.
@orfait do you have the full heap dump? That is, the 1 GB+ one. AFAIK it is not enabled by default, but it is the one that can help us find the cause.
What is strange is that the binding is not manipulating massive data. I can't understand how it could consume 2 GB of memory after 2 hours, even with your refresh rate for phone set to 10 seconds.
No I never produced or analyzed heapdumps.
@lolodomo : not 2 hours, growing from 500 MB to 2 GB takes about 10 hours. But yes, that's fast.
@martinvw : strange, the file produced with dev:dump-create is only 2MB. Is it normal ?
That sounds more like a threaddump, see also
AFAIK the command is the same on Linux
What is strange too is that I have been using this binding in my production environment (RPI 2) for ages and openHAB is still responding well even after several days.
The architecture of the binding is very simple. A thread is scheduled every XX seconds. It requests data from the library and uses them to set the state of the channels. These data include lists of objects. Nothing is done in the binding to release these objects. I was expecting the memory to be released when the thread ends (or at least after the Java garbage collector has run). The library executes HTTP requests to get the data and builds objects from the JSON response. Lists of objects are sometimes built. As these data are delivered to the client requesting them through the return value of a public library method, these lists are never removed inside the library itself.
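The polling design described above can be sketched roughly as follows. This is an illustration under assumptions, not the binding's code: fetchDevices is a hypothetical stand-in for a library call, and the period is arbitrary. The point is that the returned list is only referenced by a local variable, so once a refresh cycle returns, nothing keeps it reachable and the garbage collector is free to reclaim it without any explicit clear().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PollingSketch {

    /** Number of completed refresh cycles, so the behaviour can be observed. */
    static final AtomicInteger cycles = new AtomicInteger();

    /** Size of the last list received; stands in for "setting channel state". */
    static volatile int lastDeviceCount = 0;

    /** Hypothetical stand-in for a library call that builds a list from a JSON response. */
    static List<String> fetchDevices() {
        List<String> devices = new ArrayList<>();
        devices.add("iphone1");
        devices.add("iphone2");
        devices.add("mobile1");
        return devices;
    }

    /** One refresh cycle: fetch the data, use it, and let the list go out of scope. */
    static void refresh() {
        List<String> devices = fetchDevices();
        lastDeviceCount = devices.size();
        cycles.incrementAndGet();
        // 'devices' is unreachable after this method returns.
    }

    /** Runs the refresh job periodically for a while and returns the number of cycles. */
    static int runFor(long periodMillis, long durationMillis) {
        cycles.set(0);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(PollingSketch::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(durationMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return cycles.get();
    }

    public static void main(String[] args) {
        int done = runFor(50, 500);
        System.out.println(done + " cycles, last device count " + lastDeviceCount);
    }
}
```

If memory still grows with a structure like this, the retained objects are more likely held somewhere longer-lived (a cache, a static field, an unreleased connection) than by the lists themselves.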
PS: in fact, we have 3 different threads that handle different data.
@Orfait : as my fix did not help, it would be better to do your memory dump with the official version of the binding.
I could apply a removeAll or clear to all lists of objects returned by the library once I have finished using them. But is it required?
Please try this new version, in which I added some list.clear() calls. org.openhab.binding.freebox-2.4.0-SNAPSHOT.zip I am curious to know if it helps or not.
@Orfait : with your settings, we should have after 10 hours:
That is a total of 22800 HTTP calls. Even if we count 10 KB per call, it leads to 228 MB after 10 hours.
I deployed the new jar.
In fact, I have been running this binding for a long time now, including on a Raspberry Pi (before the current setup). One thing: I was running the snapshot version of OpenHAB, then moved to stable 2.3. But before that, I (wrongly) updated to the 2.4 snapshot.
I will also check the rules, but it is the same: the rules have not changed for a long time...
It is possible that the jar I provided only works well with the openHAB snapshot. I don't know if there was a breaking API change in the Eclipse SmartHome core framework since the stable v2.3 of openHAB.
I could apply a removeAll to all lists of objects returned by the library once I finished to use them. But is it required ?
Normally garbage collection should take care of this (it should remove the whole list and its elements). Clearing the lists might change how long it takes, but most likely the problem lies deeper.
A heap dump has the most value without additional cleaning code because it will make the problem easier to spot in the dump.
Note that the library we rely on ( https://github.com/MatMaul/freeboxos-java ) is using these libraries:
Maybe it would not be a bad idea to move to a more recent version of the Apache HTTP components?
I am not able to do the heap dump... I tried with -F, but this leads to a Java exception.
Strange, so dev:dump-create in Karaf is not related to heap dumps? I am pretty sure I read this in the forum.
EDIT: I must run jmap as the same user as the Java process.
https://karaf.apache.org/manual/latest-2.x/developers-guide/developer-commands.html
It seems to contain Karaf-specific diagnostics:
Consuming 1.5 GB in 10 hours would mean around 66 KB of data per HTTP call, with none of this data ever released. That is too big. Even if a few calls can return a lot of data (list of phone calls or list of network devices), most of them should probably require less than 1 KB.
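The two estimates in this discussion can be checked with a few lines of arithmetic. This is just a worked calculation using the figures quoted in the thread (22800 calls in 10 hours, 10 KB per call, 1.5 GB of growth) and decimal units (1 MB = 1000 KB, 1 GB = 1000000 KB); the class and method names are illustrative.

```java
class LeakEstimate {

    /** Total retained data in KB if every call leaked kbPerCall. */
    static long totalKb(long calls, long kbPerCall) {
        return calls * kbPerCall;
    }

    /** Average KB that each call would have to leak to reach leakedKb in total. */
    static double kbPerCall(double leakedKb, long calls) {
        return leakedKb / calls;
    }

    public static void main(String[] args) {
        long calls = 22_800; // HTTP calls in 10 hours, as estimated in the thread

        // Upper bound assumed in the thread: 10 KB retained per call.
        long total = totalKb(calls, 10);
        System.out.println("10 KB per call -> " + (total / 1000.0) + " MB"); // 228.0 MB

        // Observed growth of about 1.5 GB: what each call would need to retain.
        double perCall = kbPerCall(1_500_000, calls);
        System.out.printf("1.5 GB over %d calls -> %.1f KB per call%n", calls, perCall);
    }
}
```

The second figure comes out at roughly 66 KB per call, which matches the estimate above and is far more than the responses are expected to contain.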
When I use the top command, I can see the %MEM for my java process is around 32% + this:
KiB Mem: 996452 total, 969404 used, 27048 free, 39780 buffers
KiB Swap: 102396 total, 9216 used, 93180 free. 394224 cached Mem
And the result of "free -m" gives:
total used free shared buffers cached
Mem: 973 946 26 1 38 384
-/+ buffers/cache: 522 450
Swap: 99 9 90
As already discussed in another issue, the used memory displayed by this command is very high, but in fact there is a lot of cached memory. In my case, 450 MB is in fact available and not only 26 MB.
As discussed here https://github.com/eclipse/smarthome/issues/4490 , the right command to use to know the available RAM is: cat /proc/meminfo | grep '^MemAvailable:' | awk '{print $2/1024}'
which gives 496.832 in my current case.
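For anyone who prefers to track this value from Java rather than from the shell, here is a rough equivalent of the pipeline above. It is a sketch under assumptions: it only works on Linux kernels that expose a MemAvailable line in /proc/meminfo, the class and method names are made up for illustration, and the fallback sample lines in main are invented values, not measurements.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class MemAvailableSketch {

    /**
     * Returns MemAvailable in MiB from /proc/meminfo-style lines,
     * or -1 if no such line is present.
     */
    static double memAvailableMiB(List<String> lines) {
        for (String line : lines) {
            if (line.startsWith("MemAvailable:")) {
                // Expected shape: "MemAvailable:     508756 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]) / 1024.0;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo");
        List<String> lines;
        if (Files.isReadable(meminfo)) {
            lines = Files.readAllLines(meminfo);
        } else {
            // Not on Linux: fall back to made-up sample lines.
            lines = Arrays.asList("MemTotal:  996452 kB", "MemAvailable:  508756 kB");
        }
        System.out.println(memAvailableMiB(lines));
    }
}
```

Keeping the parsing separate from the file access makes it easy to log the value periodically or to check the parser against pasted output.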
I just reinstalled the official version of the binding (not my fixed version). My current value of available memory is 491.309. We will see the value in a few hours.
20 minutes later: 490.285
I reinstalled the original version and removed all other bindings. It is sunny today, we can keep the lights off :)
I am waiting until memory is full, then I will make a heap dump.
Available memory (for tracking)
Sun Jul 29 15:11:47 CEST 2018 : 1391.57
Sun Jul 29 15:11:53 CEST 2018 : 1386.55
Sun Jul 29 15:12:59 CEST 2018 : 1381.73
Sun Jul 29 15:14:10 CEST 2018 : 1374.26
Sun Jul 29 15:23:21 CEST 2018 : 1338.6
Sun Jul 29 15:33:48 CEST 2018 : 1303.05
cat /proc/meminfo | grep '^MemAvailable:' | awk '{print $2/1024}'
490.918
top - 15:34:23 up 12 days, 14:50, 1 user, load average: 0,09, 0,19, 0,18
Tasks: 109 total, 1 running, 108 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,9 us, 0,3 sy, 0,0 ni, 98,8 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem: 996452 total, 963920 used, 32532 free, 39828 buffers
KiB Swap: 102396 total, 9204 used, 93192 free. 383736 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27989 xx 20 0 513492 327504 14620 S 4,0 32,9 365:48.66 java
free -m
total used free shared buffers cached
Mem: 973 941 32 1 38 374
-/+ buffers/cache: 527 445
Swap: 99 8 91
Using the command cat /proc/<pid>/status
I can see that I have between 182 and 187 threads for my java process. Is it expected to have so many threads ?
VmPeak: 513524 kB
VmSize: 513492 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 327756 kB
VmRSS: 327692 kB
VmData: 493928 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 9980 kB
VmPTE: 428 kB
VmPMD: 0 kB
VmSwap: 0 kB
Threads: 185
cat /proc/meminfo | grep '^MemAvailable:' | awk '{print $2/1024}'
489.512
top - 17:10:55 up 12 days, 16:26, 1 user, load average: 0,23, 0,21, 0,18
Tasks: 109 total, 1 running, 108 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,4 us, 0,2 sy, 0,0 ni, 99,3 id, 0,0 wa, 0,0 hi, 0,1 si, 0,0 st
KiB Mem: 996452 total, 967672 used, 28780 free, 40112 buffers
KiB Swap: 102396 total, 9196 used, 93200 free. 384056 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27989 xx 20 0 513492 327536 14620 S 1,6 32,9 383:58.63 java
free -m
total used free shared buffers cached
Mem: 973 944 28 1 39 375
-/+ buffers/cache: 530 442
Swap: 99 8 91
VmPeak: 513524 kB
VmSize: 513492 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 327828 kB
VmRSS: 327628 kB
VmData: 493928 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 9980 kB
VmPTE: 428 kB
VmPMD: 0 kB
VmSwap: 0 kB
Threads: 185
It looks very stable at the Java process level. The total available memory lost only 2 MB after 4 hours, but that might be because of data logs rather than a memory leak (even if my openHAB logs and rrd4j data are saved on a network share).
At this stage, I doubt there is a memory leak in openHAB. I am running snapshot 1320 on an RPI 2.
cat /proc/meminfo | grep '^MemAvailable:' | awk '{print $2/1024}'
489.594
top - 20:03:55 up 12 days, 19:19, 1 user, load average: 0,35, 0,26, 0,21
Tasks: 109 total, 1 running, 108 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,4 us, 0,3 sy, 0,0 ni, 99,0 id, 0,1 wa, 0,0 hi, 0,3 si, 0,0 st
KiB Mem: 996452 total, 961480 used, 34972 free, 40340 buffers
KiB Swap: 102396 total, 9188 used, 93208 free. 377620 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27989 xx 20 0 513492 327544 14620 S 2,6 32,9 417:27.81 java
free -m
total used free shared buffers cached
Mem: 973 942 30 2 39 372
-/+ buffers/cache: 530 442
Swap: 99 8 91
VmPeak: 514052 kB
VmSize: 513492 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 327932 kB
VmRSS: 327608 kB
VmData: 493928 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 9980 kB
VmPTE: 428 kB
VmPMD: 0 kB
VmSwap: 0 kB
Threads: 190
Still stable. Just a few more threads.
I restarted after re-adding the other bindings (we need lights in the evening...). After 5 hours:
cat /proc/meminfo | grep '^MemAvailable:' | awk '{print $2/1024}'
390.324
top - 21:22:08 up 4:51, 1 user, load average: 0.01, 0.07, 0.17
Tasks: 24 total, 1 running, 23 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.2 us, 0.5 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2097152 total, 190752 free, 1693592 used, 212808 buff/cache
KiB Swap: 3145728 total, 3145728 free, 0 used. 403560 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
136 openhab 20 0 6013528 1.474g 19748 S 2.0 73.7 12:21.02 java
free -m
total used free shared buff/cache available
Mem: 2048 1657 182 81 207 390
Swap: 3072 0 3072
cat /proc/136/status
Name: java
Umask: 0022
State: S (sleeping)
Tgid: 136
Ngid: 0
Pid: 136
PPid: 1
TracerPid: 0
Uid: 108 108 108 108
Gid: 114 114 114 114
FDSize: 512
Groups: 114
NStgid: 136
NSpid: 136
NSpgid: 136
NSsid: 136
VmPeak: 6013536 kB
VmSize: 6013528 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 1554852 kB
VmRSS: 1554784 kB
RssAnon: 1535036 kB
RssFile: 19748 kB
RssShmem: 0 kB
VmData: 1961980 kB
VmStk: 132 kB
VmExe: 4 kB
VmLib: 17984 kB
VmPTE: 4084 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
Threads: 348
SigQ: 0/31004
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 2000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003cfdfcffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 2
Speculation_Store_Bypass: vulnerable
Cpus_allowed: 5
Cpus_allowed_list: 0,2
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 60
nonvoluntary_ctxt_switches: 13
It is weird that your Java process required 6 GB of memory!!! How could openHAB require so much memory while mine only requires 500 MB? Did you change the default startup settings? Maybe you have a problem with one of your bindings, but not Freebox.
I am experiencing a probable memory leak in openHAB: RAM reaches almost 100% (2 GB) within a few hours and it begins to use swap (500 MB).
After some tests, I discovered that I can "fix" this memory issue by disabling the Freebox binding.
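One way to narrow this down, sketched here under assumptions: the VIRT column in top is reserved address space, which on a 64-bit JVM is usually much larger than what is actually resident, so comparing the heap figures the JVM itself reports with the RES column may be more telling than VIRT. The class and method names below are illustrative; the calls are standard java.lang and java.lang.management APIs. To say anything about openHAB, code like this would have to run inside the openHAB JVM, which this standalone example does not do.

```java
import java.lang.management.ManagementFactory;

class JvmMemorySketch {

    /** Heap currently in use by this JVM, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit the heap may grow to (the -Xmx setting or the JVM default), in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Number of live threads in this JVM, daemon threads included. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("used heap (MiB): " + usedHeapBytes() / mib);
        System.out.println("max heap (MiB):  " + maxHeapBytes() / mib);
        System.out.println("live threads:    " + liveThreads());
    }
}
```

If the used heap stays modest while RES keeps climbing, the growth is probably outside the Java heap (thread stacks, native buffers, and so on), and a heap dump alone may not show it.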
My setup is :
Ask me for more info if needed.