Huge latency when using gateone on embedded systems

koenkooi commented 13 years ago

I'm deploying gateone on a systems with a 720MHz - 1GHz ARM cpu and I'm seeing a huge latency when typing into the browser terminal. It's letting me enter text at a rate of 1 char/s, but CPU load on the 720MHz board is ±15%.

Could this be related to having the multiprocessing code disabled?

liftoff commented 13 years ago

No, this would have nothing to do with multiprocessing. It shouldn't be that bad (1 CPS). Can you past your server.conf and the specifics of the hardware in question?

I have a RouterStation lying around I can test with. It's not ARM (MIPS) but it should be "close enough" to determine if something like CPU speed is the problem or if it has something to do with running on an embedded environment (minimal: busybox, limited shell/libraries,etc) is the issue. I remember experiencing something similar when I developed Escape From the Web but I've since forgotten the cause or how I got around it.

Also, can you describe the issue with a bit more detail? If you type a whole lot of characters into the terminal do they come out at 1CPS or do they show up in chunks at a rate of about 1 screen refresh/second? If it's the former I'll have my head scratching. If it's the latter it can be caused by a number of things that can be tweaked.

koenkooi commented 13 years ago

Autogenerated server.conf at the bottom of this report.

The specific hardware is this board: http://beagleboard.org/bone , it has a 720MHz cortex A8, 256MB of ram, 100MB/s ethernet and a 4GB micro-SD card is used for the rootfs. I have full bash and openssh and python 2.7.

When I browse to https://beaglebone.local with google chrome on my mac (quad i7) anything I enter will take ±1 second to show up. If I select all the text of your reply above and paste it into gateone, it will take ±2 seconds for the complete paste to show up.

The server.conf:

sso_service = "HTTP" locale = "C" https_redirect = False pam_service = "login" syslog_facility = "daemon" disable_ssl = False session_logging = True session_dir = "/tmp/gateone" cookie_secret = "ZTBiZTRiNjEyYTcxNDUzZTg1YmE3NGNhNDcwMGZjMWE5Y" syslog_session_logging = False address = "" port = 443 user_dir = "/var/lib/gateone/users" log_file_num_backups = 10 logging = "info" dtach = True certificate = "certificate.pem" keyfile = "keyfile.pem" sso_realm = None log_to_stderr = False log_file_max_size = 104857600 session_timeout = "5d" command = "/var/lib/gateone/plugins/ssh/scripts/ssh_connect.py -S '/tmp/gateone/%SESSION%/%SHORT_SOCKET%' --sshfp -a '-oUserKnownHostsFile=%USERDIR%/%USER%/known_hosts'" embedded = False debug = False js_init = "" auth = None log_file_prefix = "/var/log/gateone/webserver.log" pam_realm = "beaglebone"

liftoff commented 13 years ago

Can you do me a favor and let me know if turning off session logging helps? Gzip compression can be quite slow on embedded systems and I'd imagine that constantly writing data to disk isn't a great idea on such systems either.

Also, try setting logging to "error". That might speed things up a bit as well.

Thinking about this a bit, If disabling session logs really speeds it up a lot I'm guessing that the problem has to do with the code blocking when it writes to .golog files. I also might want to add the option to save .golog files in an uncompressed format for embedded use. Or I could just make it so that logviewer.py can read and play back syslog-formatted logs generated by Gate One (it's in the TODO list but it would be a challenge so it's not a priority right now).

koenkooi commented 13 years ago

with session_logging set to False: no perceptible difference with logging set to "error": no perciptible difference

Top still only shows 15%, so this might not be a real CPU bound issue. The SD activity indicator LED also stays nice and dark, so no disk IO going on by the looks of it.

liftoff commented 13 years ago

Try changing the command from ssh_connect.py to just "/bin/login" and see if that makes a difference. That will let me rule out some stuff.

koenkooi commented 13 years ago

that didn't help, I also get the slowness at the "Host IP/URL" prompt before it will spawn ssh.

liftoff commented 13 years ago

Here's something else to try: Change the address to be "0.0.0.0" and set debug to True. This should ensure that only one instance of Gate One is started.

Also, does the problem present in all browsers you've tried? Oh, and try manually setting the rows/cols to 24/80 in the interface to see if that speeds it up.

If none of these things work then it is likely a bug (maybe in Tornado?) and not merely some performance-related issue (which the CPU utilization is already giving good indications for).

Gwindalmir commented 13 years ago

I've noticed the same performance issue. When it's really slow, I get about 2 chars/s. The server is on a 1.3GHz Athlon system (non-embedded). It does seem to have gotten worse a few weeks ago, but that might be a coincidence. It's like there's some lag between when I type the key, and when I see the result. It's more noticeable when I'm using the arrow keys in an application.

liftoff commented 13 years ago

Well, there should always be some lag between when you press a key and when that keystroke is registered since it has to go to the Gate One server, to the SSH server, and then that SSH server has to update the screen which has to go back through the Gate One server and then back to the client where a screen update takes place. In most situations this should be faster than you can type but in situations where a key is held down (key repeat) this can cause a bit of a lag.

There's really nothing I can do about it other than maybe move cursor tracking to the client side of things and that can be very problematic. Reducing the number of rows/columns can really speed up situations like this but it ultimately boils down to the fact that the server has to render a lot of lines of HTML and send them to your client which then has to refresh the terminal. I do have plans to make this communications much more efficient but it will require some major code rewrites so don't expect it until Gate One 2.0.

As to the problem reported by koenkooi, I believe it has something to do with the way Tornado works to watch the underlying file descriptor for updates on embedded platforms but until I get my hands on a beaglebone I won't know for sure. I'll be ordering one shortly (only $89 and it looks like a sweet toy regardless =).

liftoff commented 13 years ago

I almost forgot to mention this: In earlier versions of Gate One I had the cursor position sent to the client directly as its own object (x,y) and then the client would wrap that posiition with the [span class="cursor"] [/span] tags (Github doesn't like gt or lt symbols). However, it turned out to be quite CPU intensive on the client and was actually slower than doing the cursor rendering on the server side of things.

Having said that, now that I've got that kind of processing going on inside of an HTML5 Web Worker I may need to revisit this. Web Workers allow you to keep a web page running smoothly while CPU-intensive stuff takes place in a different thread (the worker). So this just might work out. In fact, I'm thinking about ways I can do this pretty quickly in my head right now :)

koenkooi commented 13 years ago

Setting rows/col to 80/24 manually seems to have fixed it! Can those be set in server.conf as well?

liftoff commented 13 years ago

The rows/columns can be configured globally using the (recently-added) js_init option in server.conf. Just set it like so:

js_init = "{rows: 24, cols: 80}"

Having said that, I just tested it out and sure enough it applies the setting when Gate One is loaded. However, the sendDimensions() function doesn't take this into account so it will get overridden by the user's browser. I'll try to get a commit in tonight that fixes that.

So put that in your server.conf and sit tight. A fix should be in tonight's commit.

liftoff commented 13 years ago

As a side note I'm really curious why having rows/cols higher than 24/80 makes it go super slow. Does it get progressively slower as you add more rows/cols or is it an on/off thing where anything higher than 24/80 drops it to 1-2CPS?

liftoff commented 13 years ago

I just pushed a commit that should make:

js_init = "{rows: 24, cols: 80}"

...work properly. Please confirm, thanks!

Gwindalmir commented 13 years ago

I was using it last night for about two hours, in a maximized browser window (1680x1050). After a while (more than 45mins), the session would get very laggy, and I'd get about 1 cps (maybe less). This alternated a few times until I got off. Now I haven't tried fixing the resolution, but it was never that bad before. I was on wireless, but I was also on the local network. Is there any information I can send? I had session logging turned off already, so that didn't help (plus I can't send it).

liftoff commented 13 years ago

When this happened, did reloading the page fix the problem or was it stuck like that until you logged out of the terminal? I ask because I think it may have to do with your very large terminal window and the fact that the scrollback buffer is saved to localStorage every time there's a screen update.

I ran into a similar problem with the client-side logging feature. I ended up disabling it completely because over time the terminal would get so slow it would eventually become unusable. In your case, terminal lines are really long so you might be pushing the limits of your browser's ability to parse and re-save localStorage when you hit some threshold of characters.

Try this: Change the number of scollback buffer lines to, say, 100 in the settings panel. Then use Gate One as you normally would and try to reproduce the environment you were in at the time. Let me know if it does or doesn't happen with a smaller scrollback buffer. If reducing the number of scrollback buffer lines improves performance considerably I might have to built in some logic that takes the width of each line into consideration before it decides to write it out to localStorage. Hmm... Or I could just disable persistent scrollback buffers altogether.

Do you think having the ability to restore your scrollback buffer after closing or reloading the Gate One tab is an important feature? I bet performance would improve by a non-trivial amount by turning it off (you could try it yourself: Set the number of lines to 0). Something to think about, anyway.

For reference, tonight I took a break from coding and got to work on the new liftoffsoftware.com. I don't know about you, but to me it seems like Gate One is getting near that "1.0 state" (if you know what I mean). I have to fix the bookmark manager, add SSH key management, and fix whatever I did recently that broke session recordings (the real-time view--saving them still seems to work =) but that's pretty much it. Bugs like the one reported in this issue are becoming less frequent which I'm hoping is a good sign and not merely an indication that less people are using Gate One :)

liftoff commented 13 years ago

Actually, rather than have you set the scrollback buffer to 0 I just pushed a commit that disables saving the scrollback buffer to localStorage altogether. Please pull down the latest changes and see if that makes a non-trivial performance improvement for you. On my system I haven't noticed much of a change but my resolution isn't anywhere near as big as yours.

Let me know, thanks!

Gwindalmir commented 13 years ago

Scrollback is important, but that's something I'd rather have done on the server side, IMO. The idea behind that is when GateOne finally supports being able to resume a session on a different browser/machine, I'd like to have access to the scrollback buffer. Similar to how 'screen' works.

Anyway, I just pulled it. I had the same issue tonight (although it wasn't as bad as before). Long sessions I can easily see the slowdown, so I'll try again tomorrow and let you know.

liftoff commented 13 years ago

Interesting idea... I can change the logic to keep the scrollback buffer on the server side. Wouldn't be too much trouble. However, it would mean that the amount of lines in the scrollback buffer would have to be controlled on the server-side of things as opposed to the client. I could make it a server-side value that's configurable on a per-user basis but that would not be simple to implement. I'll have to think about it. It might have to wait until after 1.0.

liftoff commented 13 years ago

I just pushed a commit that changed a number of things in regards to this issue:

1) The scrollback buffer is only written to disk after a 2-second timeout now. Also, the timeout gets cleared and restarted if the screen is updated within those two seconds. So it should write a lot less often to disk than it used to. This should keep things fast while preserving the scrollback buffer between page reloats/site visits/reboots/whatever.

2) There's now a 50ms timer on screen updates that works the same way but with a 150ms forced refresh time. It is kind of complicated but the end result is spectacular: Screen updates are MUCH smoother now. Especially when holding down an arrow key in, say, vim. Try it and please report back how it goes.

3) I made it so that when you hit Ctrl-l it will force a refresh of the display--regardless of what the underlying program does with that Ctrl-l keystroke. I added this because there's a small chance the cursor position can leave a "leftover" when you do the aforementioned "hold the key down" test. If that happens and the leftover cursor position annoys you, just hit Ctrl-l and it'll go away. This should be a rare occurrence and I'll be working on making sure it never happens but until then just move the cursor back to that line again (which will force a redraw) or just type Ctrl-l :)

4) The protocol that figures out which lines have changed since the last screen update has been fixed... It actually works properly now! Hah! Unfortunately it also creates that phantom cursor problem I mentioned in item 3 if a screen update is missed (packets lost or whatever). This should result in a serious bandwidth reduction.

Please test these latest changes and let me know how it goes. Thanks!

koenkooi commented 12 years ago

Trying latest git is in my TODO for today, hopefully I'll get to it.

liftoff commented 12 years ago

Awesome. I'm looking forward to your feedback.

I recently added support for disabling text renditions (e.g. colors) to terminal.py but I haven't added a configuration option to turn it on yet. Since it could provide a significant speedup on embedded systems you might want to try it out. Just change the following line (~565 in dump_html()) in termio.py:

result = self.term.dump_html()

...to:

result = self.term.dump_html(False)

The process of rendering the terminal screen into HTML is the most CPU-intensive process in all of Gate One. By neglecting to process text renditions the amount of CPU time required can be reduced by an order of magnitude.

BTW: In my experience the built-in command line tools on embedded systems are often compiled without fancy text/color support anyway (to save space) so it might be a worthwhile change.

koenkooi commented 12 years ago

Doing 'result = self.term.dump_html(False)' makes the terminal go away completely :)

koenkooi commented 12 years ago

Latest git with no changes still has the laggy text input :(

liftoff commented 12 years ago

I've done a bit of experimentation with this on an OpenWRT system and I believe it might have something to do with Python's compile-time flags or Tornado's detection/use of epoll. I'll definitely keep you in the loop as I learn more.

liftoff commented 12 years ago

My Beaglebone should arrive on Monday. I will get to the bottom of this once and for all!

koenkooi commented 12 years ago

Op 19 mei 2012, om 03:23 heeft Dan McDougall het volgende geschreven:

My Beaglebone should arrive on Monday. I will get to the bottom of this once and for all!

After you get it, please update it with http://dominion.thruhere.net/koen/angstrom/beaglebone/Angstrom-Cloud9-IDE-GNOME-eglibc-ipk-v2012.05-beaglebone-2012.05.17.img.xz (400MB)

That is a release candidate of what will ship with the upcoming A6 revision of the beaglebone. It includes a fairly recent gateone.

liftoff commented 12 years ago

Will do. Thanks. On May 20, 2012 5:38 AM, "Koen Kooi" < reply@reply.github.com> wrote:

Op 19 mei 2012, om 03:23 heeft Dan McDougall het volgende geschreven:

My Beaglebone should arrive on Monday. I will get to the bottom of this once and for all!

After you get it, please update it with http://dominion.thruhere.net/koen/angstrom/beaglebone/Angstrom-Cloud9-IDE-GNOME-eglibc-ipk-v2012.05-beaglebone-2012.05.17.img.xz(400MB)

That is a release candidate of what will ship with the upcoming A6 revision of the beaglebone. It includes a fairly recent gateone.

Reply to this email directly or view it on GitHub: https://github.com/liftoff/GateOne/issues/69#issuecomment-5807651

liftoff commented 12 years ago

I've got my Beaglebone booted up and running Gate One. HOLY COW IS IT SLOW! Yowza! There's definitely something broken here....

liftoff commented 12 years ago

Aha! I am making GREAT progress on this. I have got Gate One running on a 64x233 (rows/cols) right now and it is very responsive. Big screen updates can take a second or so to update but for interactive use it is nice and fast.

Mostly this is just configuration changes but there were also a few extra Python modules that were necessary in order to speed it up a bit. I'm going to see if I can figure out a precise list... I was getting close to having a list when In installed "python-modules" thinking to myself, "Gee, I wonder what that is?" LOL.

I notice that colorized logging was enabled after I installed that though--I'm really curious to know which module it was that enabled it.

I'll be researching this further...

koenkooi commented 12 years ago

Op 22 mei 2012, om 03:24 heeft Dan McDougall het volgende geschreven:

Aha! I am making GREAT progress on this. I have got Gate One running on a 64x233 (rows/cols) right now and it is very responsive. Big screen updates can take a second or so to update but for interactive use it is nice and fast.

Mostly this is just configuration changes

Will 'partial' configs be supported? I'd like to ship a config file will all the speed options we need, but without the keys and certificates so gateone wil generate them on first launch.

liftoff commented 12 years ago

"Partial configs" are supported but there's a caveat: Gate One will not update the server.conf with the missing settings. It will simply proceed using defaults for everything. This would work fine for all settings except one: cookie_secret. Getting a new cookie_secret every time Gate One starts isn't so bad though; I've considered making it change at a regular interval anyway.

Having said that, I understand what you're asking for and I'll see if I can make some changes to gateone.py so that it will add any missing settings to the server.conf. Shouldn't be too difficult and it is a good idea anyway.

liftoff commented 12 years ago

I just checked out your fork of the oe-embedded repo and sent you a pull request via Github. Let me know if I sent it to the wrong repo--I noticed that the version there isn't the same as what I got via the setup-scripts repo. I can also package it up in a tarball if you like.

For reference, here's the summary of what I did:

Updated the git revision to the one I just committed. It includes support for the new "partial config" idea. Works GREAT BTW :)
Added two new files: A "partial config" server.conf and a little Beaglebone/GateOne logo I whipped up.
Updated the gateone_git.bb file to install these new files and I also added/removed some dependencies. The only notable ones are: Added python-unixadmin and removed findutils (Gate One doesn't make that external find call anymore).

Overall I'm LOVING my Beagleboard. I'm going to have have a lot of fun with this thing. I need to get a better handle on how to contribute because I have a neat little Gate One plugin that lets you toggle the LEDs on and off based on the VT-100 LED escape sequences. It also shows you the state of each LED (though I'm still working on making it smart enough to know when an LED is in something like "heartbeat" mode).

liftoff commented 12 years ago

Since the new Gate One bitbake package has been added to the Angstrom base I'm going to close this ticket out. The new version no longer suffers this huge latency problem. Thanks for reporting it!

liftoff / GateOne

Huge latency when using gateone on embedded systems #69