zlatinb / muwire

MuWire file sharing client for I2P
GNU General Public License v3.0

Very poor UI responsiveness while sharing large files #84

Closed Searinox closed 2 years ago

Searinox commented 2 years ago

This must have started some time during the new build releases.

The setup involves hashing a series of large, multi-GB files. At the same time, another user is downloading a (different) large file from the instance in question. I am not sure to what extent this is contributing but I am mentioning it here.

The interface is very unresponsive. It accepts a click roughly every 30 seconds, I think in between finishing sharing one file and moving on to the next. I tabbed out of the Library, but the issue persists no matter where I am.

Searinox commented 2 years ago

There is no filter active in the Library at the time of this issue.

Searinox commented 2 years ago

Can confirm that the UI is being "released" for interaction when hashing transitions to another file. It has moved on to smaller files and the UI is now more responsive but still very choppy. It appears to be queuing up and processing exactly one user event (click, tab, etc.) in between files being shared and nothing while a file is being hashed. And when it got to the really big files it was utterly unusable.

Searinox commented 2 years ago

[image]

From what I can tell the hashing thread is the only one that's really pushing the CPU and it's fully using its core. I don't think the UI thread is stuck in a simple code loop for two reasons:

One is that already in the screenshot it doesn't seem like there's more than one thread being busy. If the UI thread was also trying to execute code in a loop I'd expect it to be taking up another core.

Two, there are subtle behavior differences I'm acquainted with when UI code is actually being held up. Such as lack of response to being moved around, attempts to do multiple clicks while hung resulting in the UI being whited out and a "(Not responding)" being added to the window title, and complete lack of response to right-click from the tray icon, which isn't the case.

I'd think something is stuck in a wait loop for another event or perhaps drawing is disabled and only re-enabled when files complete.

I could, however, be all wrong about this for two reasons. One, I don't actually know how much CPU the hashing takes. If it's disk-bottlenecked (and I do know disk is at 100%) then the hashing itself may only be using a small percentage, and what I'm looking at there is indeed the UI thread. Two, I can't fetch the stack, and from the calls listed in the screenshot I don't actually know which thread is doing what.

zlatinb commented 2 years ago

Hmm, the hashing should be completely independent from the UI thread. You can fetch a stack by using the jstack command from a command prompt. First get the process id of the MuWire process with jps, then pass that value to jstack.

The stack trace will print all threads; the one you're interested in is called AWT-EventQueue-0 . Try to hit jstack exactly while the UI is unresponsive, that will show us what is going on.
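For cases where running jstack from a command prompt is inconvenient, a similar snapshot can be taken from inside the JVM. The following is only a sketch of that idea, not something MuWire does: the class name `EdtStackDump` and the method `dumpThread` are made up for illustration, and the output is much less detailed than what jstack prints.

```java
import java.util.Map;

// Hypothetical helper, not part of MuWire: prints the stack of every live
// thread whose name matches, similar in spirit to one entry of jstack output.
class EdtStackDump {

    static String dumpThread(String threadName) {
        StringBuilder sb = new StringBuilder();
        // Thread.getAllStackTraces() returns a snapshot for all live threads.
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().equals(threadName)) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a Swing application the thread of interest would be "AWT-EventQueue-0";
        // this standalone demo dumps its own main thread instead.
        System.out.print(dumpThread(Thread.currentThread().getName()));
    }
}
```

Such a helper would have to be called from a thread other than the frozen UI thread, for example from a background timer, otherwise it would never get a chance to run.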

Searinox commented 2 years ago

It is indeed the UI thread that's stuck looping. The hashing has finished, and I can confirm that it fully hashed all of the files: the disk activity has stopped, and I used a 2nd system to get the file list of the bugged one, which shows everything I expected, hashed correctly. All the while the UI has now frozen. It's showing fewer than the true total of files and claims to still be hashing one file that I know it did hash. Meanwhile MuWire is still fully using one core. I'm going to do the jstack thing now.

Searinox commented 2 years ago

As we speak the thread is still looping and the UI remains frozen so this should catch the bug in the act. Here's what I got:

"AWT-EventQueue-0" #21 prio=6 os_prio=0 cpu=6382812.50ms elapsed=18904.32s tid=0x00000167ffa1d800 nid=0x4214 runnable [0x00000037266fb000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.AbstractStringBuilder.append(java.base@11.0.12/Unknown Source)
        at java.lang.AbstractStringBuilder.append(java.base@11.0.12/Unknown Source)
        at java.lang.StringBuilder.append(java.base@11.0.12/Unknown Source)
        at sun.text.normalizer.Norm2AllModes$NoopNormalizer2.normalize(java.base@11.0.12/Unknown Source)
        at sun.text.normalizer.NormalizerBase.nextNormalize(java.base@11.0.12/Unknown Source)
        at sun.text.normalizer.NormalizerBase.next(java.base@11.0.12/Unknown Source)
        at java.text.CollationElementIterator.next(java.base@11.0.12/Unknown Source)
        at java.text.RuleBasedCollator.compare(java.base@11.0.12/Unknown Source)

zlatinb commented 2 years ago

Ok, what this stack trace tells me is that the library table is sorted by some column and that it's stuck sorting because a new upload was started. Approximately how many files are you sharing? The performance of the UI will be affected by the number of files, not by their size.

zlatinb commented 2 years ago

I can speed up the sorting process significantly by using a "stupid" string comparator, one that doesn't obey locale sorting rules. Some may see it as a regression though.
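To make the trade-off concrete, here is a rough sketch of the two kinds of comparator being discussed. It is not MuWire's actual code: the class `NameSortSketch`, both method names, and the sample file names are invented for illustration. The locale-aware variant goes through `java.text.Collator` (the `RuleBasedCollator.compare` frame in the stack trace above), while the "stupid" variant uses the JDK's built-in `String.CASE_INSENSITIVE_ORDER`, which compares character by character and ignores locale rules.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: contrasts a locale-aware sort with a plain
// case-insensitive one. Names here are assumptions, not MuWire identifiers.
class NameSortSketch {

    // Locale-aware ordering; each comparison walks collation elements,
    // which is comparatively expensive on large tables.
    static List<String> sortLocaleAware(List<String> names, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(names);
        copy.sort(collator); // Collator implements Comparator<Object>
        return copy;
    }

    // Simple ordering that ignores case but knows nothing about locale rules.
    static List<String> sortSimple(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = List.of("banana.txt", "Apple.iso", "cherry.mp3");
        System.out.println(sortLocaleAware(names, Locale.ENGLISH));
        System.out.println(sortSimple(names));
    }
}
```

For plain ASCII names the two orderings agree; they can differ for accented or non-Latin names, which is where the possible regression would show up.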

Searinox commented 2 years ago

Several tens of thousands of files. I obviously can't switch to the table tab anymore to confirm, but I also cannot deny that at some point I may indeed have sorted the table, because this instance has been running for a while now. Oddly enough it isn't unsticking anymore though; it's gone from slow updates to a complete halt.

I can test whatever fix you introduce. I will also proceed to terminate the process and unshare then re-share that folder this time knowing I never sorted the list and see if I can definitively confirm the cause.

Can you intercept the sort event call and, if hashing is ongoing, instead set a flag and exit, then when all hashing tasks are finished, if the flag is set, unset it and programmatically call the sort? This would disable sorting while hashing but then immediately process it once hashing tasks are all done. It would have to be the entire hashing task, not just the currently hashed file.
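The flag idea above could look roughly like the sketch below. It is hypothetical throughout: the class `DeferredSortSketch` and its methods are invented, it assumes the hashing subsystem can signal when an entire batch starts and finishes, and it assumes all three methods are called on the same (UI) thread, so no locking is shown.

```java
// Hypothetical sketch of "defer the sort while hashing is ongoing".
// Assumes single-threaded use (for example, all calls made on the UI thread).
class DeferredSortSketch {
    private final Runnable sortAction; // whatever actually sorts the table
    private boolean hashing;           // true while the whole hashing batch runs
    private boolean sortPending;       // set when a sort was requested during hashing

    DeferredSortSketch(Runnable sortAction) {
        this.sortAction = sortAction;
    }

    void hashingStarted() {
        hashing = true;
    }

    // Called once the entire batch is done, not after each individual file.
    void hashingFinished() {
        hashing = false;
        if (sortPending) {
            sortPending = false;
            sortAction.run();
        }
    }

    // Called instead of sorting directly when the user clicks a column header.
    void requestSort() {
        if (hashing) {
            sortPending = true; // remember the request and return immediately
            return;
        }
        sortAction.run();
    }

    public static void main(String[] args) {
        DeferredSortSketch s = new DeferredSortSketch(() -> System.out.println("sorting"));
        s.hashingStarted();
        s.requestSort();     // deferred, nothing printed yet
        s.hashingFinished(); // the pending sort runs here
    }
}
```

Multiple sort requests made during hashing collapse into a single sort at the end, which is the main point of using a flag rather than a queue.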

zlatinb commented 2 years ago

With the current hashing architecture I don't know how many files are still pending to be hashed, so to do what you propose I'll need to rewrite the hashing subsystem. I am considering doing that as part of #66 issue 3 anyway, but for the time being some other solution will be needed.

Please try https://muwire.com/downloads/MuWire-0.8.9-GitHub84.zip . That build uses a simple case-insensitive sorting of the Name column in the library table. I haven't committed it to git yet because I'm still undecided if it's the best thing to do.

To reproduce the issue, kill the stuck instance and restart it. Even while the files are loading, sort the Library Table view by Name. After all files load, try hashing some files, or uploading.

Searinox commented 2 years ago

I did as you asked. At the time of clicking, it had only loaded around 4000 files into the Library. As soon as I clicked the Name column to sort, MuWire became fully unresponsive and did not respond to my attempt to switch to another tab. After a long time it finally switched to that tab but remained largely unresponsive, even though it was no longer onscreen.

If I may remind you of another idea: you made several tabs stop updating their lists while unfocused, and that at least made the application responsive while not in the affected tab, in this case Library. I can see a use for it here as well.

zlatinb commented 2 years ago

Ok, partially implemented in 84-2; more specifically the library UI gets updated once a second regardless of tab order. Let's see how that behaves in the test case and if it's still slow I'll implement not updating unless the tab is visible.

https://muwire.com/downloads/MuWire-0.8.9-GitHub84-2.zip
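One way to picture "updated once a second" is to buffer incoming changes and apply them to the table in a single batch on each timer tick, so the table is re-sorted once per tick instead of once per file. The sketch below only illustrates that general pattern and is not the code in the 84-2 build; `BatchedLibraryUpdates` and its methods are invented names, and the timer itself is left out (in a Swing application, a `javax.swing.Timer` with a 1000 ms delay calling `flush()` would be one option).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative batching sketch: changes accumulate in "pending" and reach the
// visible rows only when flush() is called, e.g. once per second by a timer.
class BatchedLibraryUpdates {
    private final List<String> pending = new ArrayList<>();
    private final List<String> tableRows = new ArrayList<>();

    // Called whenever a file finishes loading or hashing (possibly very often).
    synchronized void fileShared(String name) {
        pending.add(name);
    }

    // Called by the periodic timer; returns how many rows were applied.
    // A real table model would fire a single change event here.
    synchronized int flush() {
        int applied = pending.size();
        tableRows.addAll(pending);
        pending.clear();
        return applied;
    }

    synchronized int visibleRowCount() {
        return tableRows.size();
    }

    public static void main(String[] args) {
        BatchedLibraryUpdates updates = new BatchedLibraryUpdates();
        updates.fileShared("a.iso");
        updates.fileShared("b.iso");
        updates.fileShared("c.iso");
        System.out.println("rows before flush: " + updates.visibleRowCount());
        System.out.println("applied in one batch: " + updates.flush());
        System.out.println("rows after flush: " + updates.visibleRowCount());
    }
}
```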

Searinox commented 2 years ago

My experience was the following:

Starting up, the UI was normally very responsive even in Table mode. After a few thousand files I sorted by name. There was some strain but not severe. In the sense that if I was to hold down a letter and write an entire line of it in the searchbox, you could see a low framerate of updates, but it was subtle.

Tabbing away while this was going on and trying to type in the searchbox again, it was fully responsive. I then went back to the Library.

Performance was more or less the same. After around 20000 files, however, the lag became noticeable. Trying to fill the searchbox with a single character, there were around 2-3 updates per second. Tabbing away, the UI was more responsive but it was starting to show some signs of lag, similar to what was in Library itself after the first few thousand. Still very usable though.

If you're going to implement not updating unless the tab is visible, updates should be skipped if either Tree view is selected in Library or Library itself is tabbed out of. That way even being in the Library shouldn't lag if it's in Tree view.

Either way this is a huge improvement over initial behavior, where the UI would freeze for tens of seconds up to minutes at a time and more than once required terminating.

zlatinb commented 2 years ago

In 84-3 the Library UI will only get refreshed if the tab is visible. Refreshing happens on a timer, so it may be up to a second after selecting the tab. It will refresh regardless whether Tree or Table views are selected.

Please test and, if you're happy, close the issue: https://muwire.com/downloads/MuWire-0.8.9-GitHub84-3.zip
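The behaviour described for 84-3 (refresh on a timer, but only while the Library tab is visible) can be sketched as a small gate in front of the refresh call. This is an assumption-laden illustration rather than the actual change: `VisibleTabRefresher`, `markDirty`, `setTabVisible` and `onTimerTick` are hypothetical names, and the sketch assumes all calls happen on the UI thread.

```java
// Hypothetical sketch: a periodic tick refreshes the view only when there is
// something new to show AND the tab is currently visible.
class VisibleTabRefresher {
    private final Runnable refresh; // whatever repaints or re-sorts the Library view
    private boolean dirty;          // true when data changed since the last refresh
    private boolean tabVisible;     // true while the Library tab is selected

    VisibleTabRefresher(Runnable refresh) {
        this.refresh = refresh;
    }

    void markDirty() {
        dirty = true;
    }

    void setTabVisible(boolean visible) {
        tabVisible = visible;
    }

    // Meant to be called by a periodic timer; returns true if a refresh ran.
    boolean onTimerTick() {
        if (!dirty || !tabVisible) {
            return false; // hidden tab or nothing new: skip the expensive work
        }
        dirty = false;
        refresh.run();
        return true;
    }

    public static void main(String[] args) {
        VisibleTabRefresher r = new VisibleTabRefresher(() -> System.out.println("refreshing Library"));
        r.markDirty();
        System.out.println("tick while hidden ran refresh: " + r.onTimerTick());
        r.setTabVisible(true);
        System.out.println("tick while visible ran refresh: " + r.onTimerTick());
    }
}
```

Because the dirty flag survives while the tab is hidden, the first tick after the tab is selected catches up, which matches the "up to a second after selecting the tab" delay mentioned above.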

Searinox commented 2 years ago

Yes, there is no noticeable lag at all when tabbed away now. The issue is fixed.