Closed spacemanspiff2007 closed 3 years ago
So do I get you correctly that you identified the REST API/SSE to be a bottleneck? Or is it the item event processing itself? Did you test that independently?
Unfortunately I can't pinpoint where the problem is since this is an end to end test. I am using the Rest API to access openhab and that is why I created the issue with the corresponding tag. The root cause for the high CPU usage and long processing time might be somewhere else, but I do not know how to test that.
Try with an API token created with openhab:users addApiToken or http://localhost:8080/createApiToken instead of Basic authentication. The password hashing performs a lot of iterations because it was supposed to be a rather infrequent operation. Checking a token is more efficient, and tokens should be preferred over passwords for bulk operations.
Or stop the REST auth bundle with bundle:stop org.openhab.core.io.rest.auth and you will get the unsecured API back like in OH2.
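To illustrate why a password check costs so much more than a token check, here is a minimal Python sketch. It is not the openHAB implementation: the algorithm (PBKDF2-HMAC-SHA512), the iteration count and the function names are assumptions chosen only to show the difference in order of magnitude.

```python
import hashlib
import hmac
import os
import time

# Assumed parameters for illustration only; not taken from the openHAB source.
ITERATIONS = 65_536
SALT = os.urandom(16)


def hash_password(password: str, salt: bytes = SALT) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA512; the cost grows linearly with ITERATIONS."""
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, ITERATIONS)


def check_password(password: str, stored: bytes) -> bool:
    """Slow path: the full key derivation runs again for every single request."""
    return hmac.compare_digest(hash_password(password), stored)


def check_token(token: str, known_token_hashes: set) -> bool:
    """Fast path: one SHA-256 digest plus a set lookup."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest() in known_token_hashes


if __name__ == "__main__":
    stored = hash_password("habopen")
    tokens = {hashlib.sha256(b"oh.example.token").hexdigest()}

    start = time.perf_counter()
    check_password("habopen", stored)
    print(f"password check: {time.perf_counter() - start:.4f}s")

    start = time.perf_counter()
    check_token("oh.example.token", tokens)
    print(f"token check:    {time.perf_counter() - start:.6f}s")
```

If every REST request carries a Basic auth header, the slow path runs once per request, which would explain a CPU spike during bulk operations.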
I disabled the auth bundle with bundle:stop org.openhab.core.io.rest.auth and things look much better.
Item creation is still 40% slower, but value update is on par with OH2.
Updated 300 item definitions in 3.303s --> 90.821 updates per sec
Benchmark duration: 2.042s --> 146.906 updates per sec
Pings (6): min: 24.0 max: 53.0 median: 25.0
Benchmark duration: 1.966s --> 152.585 updates per sec
Pings (6): min: 23.0 max: 25.0 median: 25.0
Benchmark duration: 2.048s --> 146.476 updates per sec
Pings (6): min: 22.0 max: 26.0 median: 25.0
@ghys : Would it be possible to cache the last ten or so auth headers so it is just a simple lookup? Since the header doesn't change it can be directly linked to the user and role.
Edit: How would I disable the auth bundle in the services.cfg?
Would it be possible to cache the last ten or so auth headers so it is just a simple lookup?
I suppose, but since the basic auth headers have passwords in clear text, it's normally good practice not to keep them in memory too long. You retain the password strictly for the time it's necessary to hash it and check it against what you have on file.
I can't answer your other question, but I think you could modify runtime/system/org/openhab/distro/distro/3.0.0-SNAPSHOT/distro-3.0.0-SNAPSHOT-features.xml and remove the openhab-core-io-rest-auth feature from the openhab-runtime-base feature.
I suppose but since the basic auth headers have passwords in clear text, it's normally good practice not to keep them in memory too long,
It could be cleared a minute after the last successful request, that way the password doesn't get kept in memory too long.
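The caching idea discussed here could look roughly like the following Python sketch. It is not the fix that was later proposed for openHAB; the class name, the 60 second lifetime and the choice to key the cache by a SHA-256 digest (so the clear-text header itself is not retained) are all assumptions made for illustration.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class AuthCache:
    """Remember successful logins for a short time, keyed by a digest of the header."""

    def __init__(self, verify: Callable[[str], Optional[str]], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._verify = verify  # slow check: returns a role, or None on failure
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # digest -> (role, expiry)

    @staticmethod
    def _key(header: str) -> str:
        # Only a digest is stored, never the clear-text credentials.
        return hashlib.sha256(header.encode("utf-8")).hexdigest()

    def authenticate(self, header: str) -> Optional[str]:
        now = self._clock()
        # Drop every entry whose lifetime has run out.
        self._entries = {k: v for k, v in self._entries.items() if v[1] > now}
        key = self._key(header)
        hit = self._entries.get(key)
        if hit is not None:
            role = hit[0]
        else:
            role = self._verify(header)  # expensive password hashing happens here
            if role is None:
                return None  # failed logins are never cached
        # Sliding expiry: cleared ttl seconds after the last successful request.
        self._entries[key] = (role, now + self._ttl)
        return role
```

With such a cache, only the first request in a burst pays for the password hashing. The trade-off raised in the thread remains: a fast digest of the credentials lives in memory for the cache lifetime, so the lifetime should stay short.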
@kaikreuzer Is there any way this could be tackled for RC2? I'd like to perform further tests concerning the performance and a fix for the release would be too late (imho). What do you think?
I don't know, what's the suggestion here? If somebody has an improvement and creates a PR for it, it might still be possible to get it into the release.
I don't know, what's the suggestion here?
Cache a successful user/role lookup so hashing does not have to be done every time a request gets processed.
A quick workaround would also be just a simple switch to disable auth altogether.
Since this already has a significant performance impact on my main machine I am concerned that this issue might effectively break the possibility to use HABApp on small embedded devices.
bundle:stop org.openhab.core.io.rest.auth works but is not persisted across restarts of openhab.
Any ideas what might be a good workaround?
@kaikreuzer I have the first users reporting that this - as I suspected - actually breaks their installation. Things that used to take 100-200ms now take 10-15 seconds rendering openhab effectively useless. Also many users want to turn authentication off because they run a reverse proxy anyway and do authentication there so the auth from openhab is giving them additional problems.
As a quick fix I suggest creating an option to disable auth and to switch to basic auth in the config files (e.g. runtime.cfg). Do you have any idea whom I could ask to implement this?
Traefik is required to use basic auth, too
Edit: New issue #2038 appeared
I can confirm that the android app is much slower since I upgraded my openhab server to version 3.
Also, I can confirm that it was as fast as it used to be after I switched authentication off using bundle:stop org.openhab.core.io.rest.auth as suggested.
However, with authentication disabled, I was no longer able to access the settings in the main UI because the login was no longer working. It only displayed a grey screen after entering username and password. After restart, when authentication is activated again, I can access the settings again but the app is slow again.
Long story short, when an option to deactivate authentication is provided, please have a look at the main UI and make sure that settings can still be accessed with disabled authentication.
I think it would be enough to clear the cookies to get the main UI working again but I am not sure. Currently I am trying to create a fix
Hi! Just wanted to leave a note here as I experience problems with slow item loading and a forum user referenced this issue in https://community.openhab.org/t/oh3-item-list-model-long-load-times/112726 I have already tried some benchmarking by firing REST GET requests from a Python script as I suspected that only a few items are the culprit... I couldn't pin it down to a thing or binding though.
Is the problem here that main UI makes many many many requests that all have to go through authentication? And if so, is high volume requests also why basic auth through NGINX is now totally unusable? The CPU bogs down and it is essentially unusable. Even if you disable Rest API Auth in the main UI, would this not still be a problem going through NGINX since the requests need to get authenticated there too?
If NGINX is the recommended method to secure outside access, this high volume of requests by main UI makes it totally unusable.
I ran bundle:stop org.openhab.core.io.rest.auth as suggested. Connections through NGINX basic auth flew right in without a problem. And connections from the LAN work fine too. BUT, you cannot log in to main UI as an administrator. Once you log in, it just goes to a grey empty screen of nothing. Clearing browser cache/data doesn't fix that. Incognito windows don't fix it. Nothing I've tried seems to get around that.
So it fixes one problem and creates another.
I have REST API Auth turned off, and got remote access through nginx and main UI admin to work with this config: (plain http port 34000 w/basic auth)
omr@shs2:~$ cat /etc/nginx/sites-available/openhab
##
# You should look at the following URL's in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# https://www.nginx.com/resources/wiki/start/
# https://www.nginx.com/resources/wiki/start/topics/tutorials/config_pitfalls/
# https://wiki.debian.org/Nginx/DirectoryStructure
#
# In most cases, administrators will remove this file from sites-enabled/ and
# leave it as reference inside of sites-available where it will continue to be
# updated by the nginx packaging team.
#
# This file will automatically load configuration files provided by other
# applications, such as Drupal or Wordpress. These applications will be made
# available underneath a path with that package name, such as /drupal8.
#
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.
##
# Default server configuration
#
server {
    listen 34000;
    root /var/www/html;
    # Add index.php to the list if you are using PHP
    index index.html index.htm index.nginx-debian.html;
    server_name xxxxx.com;
    add_header Set-Cookie X-OPENHAB-AUTH-HEADER=1;
    location / {
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        #try_files $uri $uri/ =404;
        proxy_pass http://localhost:8080/;
        proxy_buffering off;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Authorization "";
        auth_basic "Username and Password Required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
omr@shs2:~$
I should point out that with Rest Auth turned off, I wasn't able to log in as an administrator even when not going through NGINX. Even when going right into OH locally on 8080, trying to log in as an admin on Main UI goes to a grey screen of nothing.
Strange. I have no problems with that (OH3.0.0 release). But, I took the update path straight from 2.5.11.
Ah ha. Before I disabled rest auth with bundle:stop org.openhab.core.io.rest.auth, I did have that allow basic authentication setting turned on. Maybe I should try turning rest auth back on, logging in, turning that setting off, then disabling rest auth again.
I do not stop org.openhab.core.io.rest.auth. As soon as I do, I'm not able to log in, just a blank page.
All I had to do was add these two lines to my /etc/nginx/sites-available/openhab, and restart nginx.
Both remote main UI and Android App login works.
add_header Set-Cookie X-OPENHAB-AUTH-HEADER=1;
proxy_set_header Authorization "";
For me these two entries change nothing in the time of response. Still it's faster with disabled rest auth bundle.
Ah, OK. My problem was not about performance, but the fact that the Android App did not connect at all from WAN side. Sorry about misleading you.
Interestingly, switching the same switch is slow using the sitemap within the android app, as discussed, but fast using HabPanel, also from the android app. Are they using different interfaces for communication?
@Pedals2Paddles I am having the same problem and created an issue: #2094
Is anyone working on fixing this?
I am trying something in #2030 but I am really struggling. Any help would be appreciated.
I have a fix prepared in #2101
This issue has been mentioned on openHAB Community. There might be relevant details there:
https://community.openhab.org/t/oh3-with-nginx-reverse-proxy-and-authentication/106528/43
Hey guys. Looks like a cause and solution has been identified. See this thread starting here. There is a misplaced directive in the NGINX configuration file causing this mess. Once you move the directive down to the location / block where it belongs, everything works beautifully.
My configuration had the proxy_set_header Authorization "" directive up in the top section, not under the location / block. This is where the OpenHabian-config tool put it twice, and where it seems to have put it for everyone else having this problem. The correct placement is also not in the documentation on the web (which I know needs to be updated), so it is unlikely anyone would know or notice that something is out of place.
I moved the directive down to the location / block as suggested by ysc and mstormi. Everything now seems to operate with lightning efficiency. The CPU is no longer melting down when trying to log in. It actually seems just as fast as when connected locally not through NGINX. So this certainly seems to have been the fix we’ve all been looking for.
If this takes care of it for @spacemanspiff2007 and the others having this issue, I believe the issues on this matter and the PR can be closed with thanks.
I am not using NGINX, so I do not see how this could fix it for me. Just an openhab server on a Raspberry and the android app with local server connection. But still I have a very slow connection from the sitemap on the app. HabPanel, however, is fast as it used to be in OH2.5.
Using bundle:stop org.openhab.core.io.rest.auth solves the performance problem, but only until the next restart. And it breaks the config dialog (see above).
I am not using NGINX but a bare metal OH installation so unfortunately this does not fix anything.
What is the difference between sitemap (slow) and HabPanel (fast) in the android app in terms of communication to the server? Maybe this difference can help find a solution?
@woklei Do you use basic auth for locally accessing openHAB with the Android app? Or is there a problem even without authentication?
I use BasicAuth. Username and password.
Is there any way to use an API Token with the app? I heard this is supposed to be faster...
No, API tokens are not yet supported by the app. But as this issue is about comparison to OH2: There wasn't any basic auth in place at all, so it feels like comparing apples with pears? To get the same performance as with OH2, simply disable basic auth for the local access.
I was seeing slow performance with the android app and OH3 on the local network. I originally had OpenHAB credentials entered in the app for the local connection, but removing these credentials also solved the slow performance. I've not dug into it, but assume the app stops sending an auth header, so OH3 falls back to the unauthenticated access which I think still gives access to the sitemaps/items but not admin areas.
As simple as that. I did not know I could use the app without authentication. I had the username and password entered even before I migrated to OH3. I never knew that this was optional. I removed the credentials from the app and performance is fine again. However, there are some questions remaining now. Maybe you can shed some light on them for me:
Does the app provide any features which the server requires authentication for?
what did the app and server do with username and password in the OH2.5 world?
Ignored it. It was only relevant if you used a reverse proxy with auth even for the local network.
what is authentication in the app used for
The app hasn't been changed (yet) wrt authentication options. So for local access, there still should be no need to configure username/pwd.
Does the app provide any features which the server requires authentication for?
Afaik no. It uses the item & sitemap endpoints, which are not considered "admin" endpoints and are thus not secured by default.
Ok, thank you for the information. So I will happily use it without authentication. :-)
Yes, keep in mind you have two types of accounts: A) the account to the proxy/cloud service (like NGINX or openHAB Cloud/myopenhab.org) B) the account to identify yourself to openHAB itself and get an administrator role to make changes to the instance. These are completely different. B) didn't exist in OH2 so you had a Basic auth option in the apps for A) only.
The way it should work now (represented are connections and HTTP headers in the requests) when you're using a proxy service is:
Client <==============> Reverse proxy service <==================> openHAB
        Authorization: Basic <A>
        X-OPENHAB-TOKEN: <B>                     X-OPENHAB-TOKEN: <B>
First note how the Authorization header is NOT passed along in the request the reverse proxy makes to openHAB, that's what the proxy_set_header Authorization "" directive in NGINX is for when it's configured properly (openHAB Cloud does it properly AFAIK). If it gets it wrong, chances are these Basic credentials will be considered by openHAB too (which would not succeed but still take time to validate for reasons I went out of my way to explain here).
Second, the most important thing to prevent slowness is ensuring password credentials will never be considered for a request to the API, that's why you should NEVER EVER enable Basic auth in openHAB API Security settings if you don't know the implications of doing so. You have both OAuth access tokens and self-created API tokens at your disposal which cover all use cases.
Hope that clears things up.
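The header flow described above can be sketched in a few lines of Python. This only models what a correctly configured proxy does with the headers; the function name and the example values are hypothetical and no real proxy code is involved.

```python
from typing import Dict


def headers_for_upstream(client_headers: Dict[str, str]) -> Dict[str, str]:
    """Return the headers a reverse proxy would forward to openHAB.

    The Authorization header (account A, meant for the proxy) is dropped,
    everything else, including X-OPENHAB-TOKEN (account B), is passed through.
    """
    return {name: value for name, value in client_headers.items()
            if name.lower() != "authorization"}


if __name__ == "__main__":
    incoming = {
        "Authorization": "Basic QTpwdw==",      # credentials for the proxy only
        "X-OPENHAB-TOKEN": "oh.example.token",  # hypothetical openHAB token
        "Host": "example.org",
    }
    print(headers_for_upstream(incoming))
```

In NGINX the same effect comes from proxy_set_header Authorization "" inside the location block that proxies to openHAB, since a header set to an empty value is not forwarded.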
So it seems that the issue of @Pedals2Paddles (NGINX use) has been solved as well as that of @woklei (local app use). In summary, we see that we have exactly the same performance as OH2, so I'd claim that this issue can be closed.
@kaikreuzer I don't see how you come to the conclusion that they have the same performance when none of the above provided any numbers run by a reproducible benchmark.
These two are using tokens and not basic auth anymore so while this may improve things for them the issue with basic auth (the initial) is not fixed. See the above paragraph:
Second, the most important thing to prevent slowness is ensuring password credentials will never be considered for a request to the API, that's why you should NEVER EVER enable Basic auth in openHAB API Security settings if you don't know the implications of doing so.
This translates to: "We have an option but it's super broken so be sure to never enable it".
I'd like to close this issue as much as you do, but for that we should actually fix it and not just wish that it would magically disappear.
Here are the numbers. You can run the benchmark yourself if you download HABApp and start it with the --benchmark argument.
With this PR we see a >20x improvement in performance and a >22x improvement in response time.
Current implementation
Bench item operations ... done!
| dur | per sec | median | min | max | mean
create item | 43.4s | 6.907 | 0.144s | 0.140s | 0.188s | 0.145s
update item | 43.3s | 6.931 | 0.144s | 0.140s | 0.198s | 0.144s
delete item | 42.7s | 7.029 | 0.141s | 0.138s | 0.195s | 0.142s
Bench item state update .... done!
| dur | per sec | median | min | max | mean
rtt idle | 5.150s | 6.990 | 0.140s | 0.139s | 0.180s | 0.143s
async rtt idle | 5.026s | 6.963 | 0.141s | 0.141s | 0.196s | 0.144s
rtt load (+10x) | 5.022s | 4.181 | 0.236s | 0.213s | 0.284s | 0.239s
async rtt load (+10x) | 5.070s | 4.142 | 0.244s | 0.224s | 0.250s | 0.241s
Cleanup ... complete
With applied fix from PR
Bench item operations ... done!
| dur | per sec | median | min | max | mean
create item | 2.557s | 117.318 | 8.00ms | 7.00ms | 20.0ms | 8.52ms
update item | 2.425s | 123.704 | 8.00ms | 6.00ms | 33.0ms | 8.08ms
delete item | 1.807s | 166.012 | 6.00ms | 5.00ms | 26.0ms | 6.02ms
Bench item state update .... done!
| dur | per sec | median | min | max | mean
rtt idle | 4.994s | 149.971 | 7.00ms | 6.00ms | 12.0ms | 6.67ms
async rtt idle | 5.003s | 133.712 | 7.00ms | 7.00ms | 14.0ms | 7.48ms
rtt load (+10x) | 5.048s | 17.828 | 56.0ms | 46.0ms | 69.0ms | 56.1ms
async rtt load (+10x) | 5.021s | 12.148 | 82.0ms | 73.0ms | 95.0ms | 82.3ms
Cleanup ... complete
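For readers who want to recompute the improvement per operation, the following Python sketch divides the "per sec" values of the two runs above. The numbers are copied from the tables; the dictionary and function names are only illustrative.

```python
# Requests per second, copied from the "per sec" column of the two runs above.
current = {"create item": 6.907, "update item": 6.931,
           "delete item": 7.029, "rtt idle": 6.990}
with_fix = {"create item": 117.318, "update item": 123.704,
            "delete item": 166.012, "rtt idle": 149.971}


def speedups(before: dict, after: dict) -> dict:
    """Return the throughput ratio after/before for every operation."""
    return {name: after[name] / before[name] for name in before}


if __name__ == "__main__":
    for name, factor in speedups(current, with_fix).items():
        print(f"{name:12s} {factor:5.1f}x")
```

Measured on throughput alone, the factors range from about 17x (create item) to about 24x (delete item), with the idle round trip at roughly 21x.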
@kaikreuzer What do I have to do to convince you that the current implementation does not work properly? Do you need more numbers? You can run the numbers yourself if you don't believe me!
Could you post the benchmark results without tokens or basic auth to really compare it against OH2 (afair, OH2 didn't support the one or the other)?
Could you post the benchmark results without tokens or basic auth to really compare it against OH2 (afair, OH2 didn't support the one or the other)?
Sorry I don't understand - can you elaborate a little bit more what I should do? OH3 needs a token or a basic auth to process REST requests. If I don't supply either of those I get http status 401 "Unauthorized".
Edit: Note that I ran the tests above from the Eclipse IDE. The following test was run with 3.0.0 from the command line because I don't know how to disable the auth bundle in the Eclipse IDE. The auth bundle has not been loaded since it has been blacklisted, so this should be comparable to OH2.
Bench item operations ... done!
| dur | per sec | median | min | max | mean
create item | 3.042s | 98.614 | 9.00ms | 8.00ms | 28.0ms | 10.1ms
update item | 3.289s | 91.208 | 9.00ms | 8.00ms | 26.0ms | 11.0ms
delete item | 1.831s | 163.836 | 6.00ms | 5.00ms | 25.0ms | 6.10ms
Bench item state update .... done!
| dur | per sec | median | min | max | mean
rtt idle | 4.998s | 156.254 | 6.00ms | 6.00ms | 14.0ms | 6.40ms
async rtt idle | 5.002s | 133.739 | 7.00ms | 7.00ms | 19.0ms | 7.48ms
rtt load (+10x) | 5.036s | 17.473 | 56.0ms | 48.0ms | 87.0ms | 57.2ms
async rtt load (+10x) | 5.017s | 12.158 | 83.0ms | 67.0ms | 0.106s | 82.3ms
Cleanup ... complete
@kaikreuzer like this:
| Test | req/sec | max resp time (ms) |
|---|---|---|
| auth off | 156.2 | 14 |
| with fix | 150.0 | 12 |
| current | 7.0 | 180 |
@kaikreuzer I took the time and tried to run the benchmark on a Pi3 with openHABian. The Benchmark didn't finish because it congested under load and the rest api server disconnected. That's why you don't see values for rtt under load.
I am getting ~0.3 req/sec so I can change/request one item every three seconds. I really hope these numbers make it clear that this is not usable at all.
+------------------------------------------------------------------------------+
| openHAB |
+------------------------------------------------------------------------------+
Bench item operations ... done!
| dur | per sec | median | min | max | mean
create item | 1003s | 0.299 | 3.340s | 3.258s | 4.734s | 3.345s
update item | 984.2s | 0.305 | 3.278s | 3.243s | 3.395s | 3.281s
delete item | 975.8s | 0.307 | 3.255s | 3.219s | 3.317s | 3.253s
Bench item state update .... done!
| dur | per sec | median | min | max | mean
rtt idle | 6.556s | 0.305 | 3.278s | 3.236s | 3.319s | 3.278s
async rtt idle | 6.541s | 0.306 | 3.271s | 3.267s | 3.274s | 3.271s
rtt load (+10x) | 0.0us | 0.000 | 0.0us | 0.0us | 0.0us | 0.0us
async rtt load (+10x) | 0.0us | 0.000 | 0.0us | 0.0us | 0.0us | 0.0us
Since we are using openHAB anyway, we could use openHAB itself to implement a workaround:
cat /etc/openhab/items/rest.items:
Switch AuthenticationSupportRestInterfaceSwitch { channel="exec:command:rest:run" }
String AuthenticationSupportRestInterface { channel="exec:command:rest:output" }
Switch AuthenticationSupportRestInterfaceSwitchActivation
cat /etc/openhab/misc/exec.whitelist:
/usr/bin/sshpass -p habopen /usr/bin/ssh -tt -o StrictHostKeyChecking=no -p 8101 openhab@localhost 'bundle:status org.openhab.core.io.rest.auth'
/usr/bin/sshpass -p habopen /usr/bin/ssh -tt -o StrictHostKeyChecking=no -p 8101 openhab@localhost 'bundle:stop org.openhab.core.io.rest.auth'
/usr/bin/sshpass -p habopen /usr/bin/ssh -tt -o StrictHostKeyChecking=no -p 8101 openhab@localhost 'bundle:start org.openhab.core.io.rest.auth'
cat /etc/openhab/things/rest.things:
Thing exec:command:rest [ command="/usr/bin/sshpass -p habopen /usr/bin/ssh -tt -o StrictHostKeyChecking=no -p 8101 openhab@localhost 'bundle:status org.openhab.core.io.rest.auth'", interval=0, autorun=false ]
cat /etc/openhab/rules/rest.rules
:
Switch AuthenticationSupportRestInterfaceSwitch { channel="exec:command:rest:run" }
String AuthenticationSupportRestInterface { channel="exec:command:rest:output" }
Switch AuthenticationSupportRestInterfaceSwitchActivation
ubuntu@ubuntu:/etc/openhab$ cat rules/rest.rules
rule "Split AuthenticationSupportRestInterface"
when
Item AuthenticationSupportRestInterfaceSwitch changed to ON or Item AuthenticationSupportRestInterfaceSwitchActivation changed
then
createTimer(now.plusSeconds(2), [ |
var MyString = AuthenticationSupportRestInterface.state.toString
MyString = MyString.split('\n').get(0)
AuthenticationSupportRestInterface.postUpdate(MyString)
])
end
rule "Split AuthenticationSupportRestInterface"
when
Item AuthenticationSupportRestInterfaceSwitchActivation changed
then
AuthenticationSupportRestInterfaceSwitch.sendCommand(ON)
end
rule "AuthenticationSupportRestInterface OFF"
when
System started or Item AuthenticationSupportRestInterfaceSwitchActivation changed from ON to OFF
then
executeCommandLine("/usr/bin/sshpass","-p","habopen","/usr/bin/ssh","-tt","-o","StrictHostKeyChecking=no","-p","8101","openhab@localhost","bundle:stop","org.openhab.core.io.rest.auth")
createTimer(now.plusSeconds(3), [ |
AuthenticationSupportRestInterface.postUpdate("Resolved")
])
end
rule "AuthenticationSupportRestInterface ON"
when
Item AuthenticationSupportRestInterfaceSwitchActivation changed to ON
then
executeCommandLine("/usr/bin/sshpass","-p","habopen","/usr/bin/ssh","-tt","-o","StrictHostKeyChecking=no","-p","8101","openhab@localhost","bundle:start","org.openhab.core.io.rest.auth")
end
rule "AuthenticationSupportRestInterfaceActivation"
when
System started or Item AuthenticationSupportRestInterfaceSwitchActivation changed or Item AuthenticationSupportRestInterface changed
then
// The bundle status text ("Active"/"Resolved") is held by the String item, not by the Switch item
if(AuthenticationSupportRestInterface.state.toString.trim == "Active"){
AuthenticationSupportRestInterfaceSwitchActivation.sendCommand(ON)
}
if(AuthenticationSupportRestInterface.state.toString.trim == "Resolved"){
AuthenticationSupportRestInterfaceSwitchActivation.sendCommand(OFF)
}
end
In your Sitemap you can add:
Text item=AuthenticationSupportRestInterface label="Authentication Support for the REST Interface [%s]"
Switch item=AuthenticationSupportRestInterfaceSwitch
Switch item=AuthenticationSupportRestInterfaceSwitchActivation
Because I have created a rule which switches the AuthenticationSupportRestInterface OFF when the system is started, it could work as a workaround. The other items and rules I created can be used to turn it on and off arbitrarily via the openHAB sitemap.
Sorry - I don't understand what you are trying to achieve. Temporarily disabling auth resulted in different problems which I have documented in another issue (I don't remember them exactly).
With basic auth the performance problem is non-existent and effectively fixed.
Test: Create 300 Number items and measure how long it takes. Post an update to 300 Number items and see how long it takes until all items report the new state through SSE events.
While doing the other tests: Ping an item every 10 secs and measure how long it takes until the new state is reported back through an SSE event.
Both tests were done with
3.0.0.M5
CPU spikes to 100% on all cores
2.5.3
Absolutely no CPU spike whatsoever
Changes compared to OH2: