blue-systems / plasma-issues

plasma 5.6 - 5.xx (ongoing)
0 stars 0 forks source link

KIO: copying files over GUI is slow/ possibly faulty #139

Open star-buck opened 6 years ago

star-buck commented 6 years ago

https://www.reddit.com/r/linux/comments/9a7k2i/psa_on_kde_never_use_the_gui_when_copying_a_large/

davidedmundson commented 5 years ago

As stated before I'm not super convinced by redditor comments, I've not seen anything confirmed on KF5.

But it'll put everyone at ease if we do some thorough tests.

Here's my proposed plan:

Things to test:

(I have some wayland stuff to finish this week, will probably start next week)

There's a second issue of plasmashell using high CPU whilst we're copying, that's still a WIP, but somewhat separate.

star-buck commented 5 years ago

Before start doing that, someone mentioned while copying the modified/added date(s) are not kept/are changed?

I suggest fixing that first and then I could also simply do a test copying a 1TB NAS drive to a local attached usb drive and back to see a) how long that takes (comparing it to a same sample from same nas to same usb on windows7) b) if file count stays the same c) if there is anything weird like messed up dates or changed/shortened filenames or filesizes spottable.

On Sat, Nov 10, 2018, 03:39 davidedmundson notifications@github.com wrote:

As stated before I'm not super convinced by redditor comments, I've not seen anything confirmed on KF5.

But it'll put everyone at ease if we do some thorough tests.

Here's my proposed plan:

Things to test:

-

test copying files several hundred times with randomly generated file sets.

test subfolder not readable

test hidden dirs

test chown/chmod at destination failing

validate (by code review) that breadth first search will always be in order

  • what happens if it's not
  • test subdirs already cached from previous copy (with cached failure?)

  • fake a hypothetically crashing kio_file - what happens?

  • fake a crashing kuiserver - what happens?

  • recursive symlink - what happens?

  • test UTF-8 file collisions when run with C-locale in-process. file names are blind char*, encoding decoding is up to the app, what if this causes collisions when converting to QUrl?

  • first test but with additional conversion: from $format1 to $format2: ext2, iso9660, ntfs, vfat with long file names, truncatable file names, etc

  • investigate this line: if (QT_LSTAT(ep->d_name, &st) == -1) { continue; // how can stat fail? }

    Test faking a fail, (ld_preload?) Add an assert regardless

  • Use of QDir::setCurrent this is global, check thread count of kio file slave process and check there's no scope there for issues

(I have some wayland stuff to finish this week, will probably start next week)

There's a second issue of plasmashell using high CPU whilst we're copying, that's still a WIP, but somewhat separate.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-437577906, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhoLn5lxU2ttJSj5mojDiYAO3jiwKks5utrrigaJpZM4WMwA4 .

davidedmundson commented 5 years ago

In my initial brief tests timestamps were transferred correctly.

I have a real reproducible (though obscure) bug with trash:/ to fix. I'm doing that first as there's a chance that might lead to discovering issues with generic error handling that files also uses.

how long that takes

I know this github ticket refers to KIO copying being slow but it'll help me if we treat KIO being potentially faulty and KIO being slow as two separate tasks.

davidedmundson commented 5 years ago

Update: I fixed the issue in trash I mentioned. Unfortunately the bug was local to within trash, and wouldn't have fixed anything else. At least we fixed something in this.

In all my copying, I have found that copying to NTFS is behaving weird. File count remained the same, but there's an oddity when viewing the folder size in dolphin that shows them being different. That gives me something concrete to follow up on.

star-buck commented 5 years ago

Timestamp would be nice to have fixed On Nov 20, 2018 13:56, "davidedmundson" notifications@github.com wrote:

Update: I fixed the issue in trash I mentioned. Unfortunately the bug was local to within trash, and wouldn't have fixed anything else. At least we fixed something in this.

In all my copying, I have found that copying to NTFS does not keep timestamps. Even though that does work with "cp".

File count remained the same, but there's an oddity when viewing the folder size in dolphin that shows them being different. That gives me two concrete things to follow up on.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-440264036, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhhuA_jstE6bd2s24hdgRn4KTILcAks5uw_wIgaJpZM4WMwA4 .

davidedmundson commented 5 years ago

NTFS file size is at least understood - there's a fix on phab, but I'm not sure it'll go in - it's a difference between two file systems.

But I've found something interesting, if you crash or kill dolphin at just the right time it comes up with "copying finished" in plasma rather than an error. This seems critically bad.

davidedmundson commented 5 years ago

But I've found something interesting, if you crash or kill dolphin at just the right time it comes up with "copying finished" in plasma rather than an error. This seems critically bad.

Fixed!

davidedmundson commented 5 years ago

As for being slow, we've got a few things:

After that I have some two other areas on the slave side, which if fixed could get us ~4x faster on my tiny files dataset.

star-buck commented 5 years ago

Great analysis and progress! Is there anyone else from KDE related to these who can help fixing/improving? On Nov 26, 2018 17:30, "davidedmundson" notifications@github.com wrote:

As for being slow, we've got a few things:

-

use of sendfile rather than reading into memory and writing data. Now merged.

there's something seriously wrong with QCoreApplication::sendPostedEvent() at the event of SlaveBase::dispatchLoop removing this makes my test set of 1million tiny chars considerably faster. As in orders of magnitudes faster. I need to understand this and get a proper fix.

After that I have some two other areas on the slave side, which if fixed could get us ~4x faster on my tiny files dataset.

-

Whilst we're copying, we're still getting a load of requests to list directories, even though we already do that at the start. We need to see where that's coming from.

SlaveBase::finished reloads a config file (???) We're spending a lot of time here.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-441704579, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhl26wo9UCVy4yX1qRZf0FZs7ZBTnks5uzBc3gaJpZM4WMwA4 .

davidedmundson commented 5 years ago

Is there anyone else from KDE related to these who can help

[Redacted - I shouldn't have made hiring notes public]

davidedmundson commented 5 years ago

All my changes are pushed, Kai also did one. They'll be in the frameworks release at the start of Jan.

In terms of speed, there was some mistakes that are fixed. Some significant.

There's lots more little tiny code-polishing cleanups and probably can get CPU usage of the transferring app. There's scope to cut ~20% of the CPU load of the transferring app doing that.

There are some huge refactors I can suggest that will bring bigger savings, but then we're talking larger 1-2 week projects.

I'm moving back to Kwin/Plasma today because of the other BS work.


In terms of file loss, there was an incorrect error handling I found and fixed. I haven't seen anything else, I've not had any data loss, and I've been copying a million files repeatedly. That isn't to say there isn't a bug, but it's certainly not present in the standard case.

davidedmundson commented 5 years ago

That hiring all went a lot faster than expected:

This is a brain dump, then I'll turn it into a more concrete list as separate issues for chimnoy

Cleanup refactor tasks:

Big refactors:

Monumental refactors:

star-buck commented 5 years ago

Great. You can also do a new ticket with all points as checkpoints.

kbroulik commented 5 years ago

Some braindump ideas from me:

about JobView, I would love a V3. We basically took V2 as-is from KDE4 but the world has changed a bit since:

davidedmundson commented 5 years ago

Great. You can also do a new ticket with all points as checkpoints.

I'll reopen each as a new ticket, then we can have a clear discussion there.

davidedmundson commented 5 years ago

As people are apparently reading my KIO notes publicly, there is one monumentally surprising thing that will make a difference.

The Clipboard.

When one cuts/pastes we update the clipboard as we move each file individually so that the clipboard contains the contents of what's remaining. I don't /really/ want to meddle with that behaviour as it seems like its data related.

Clipboards (on X especially) are slow blocking things. Dolphin updates, plasma reads the clipboard, which blocks dolphin which then updates...

This is made a trillion times worse by the fact that every QQuickText field in every application or focus will fetch mimedata info - in order to update a context menu entry that isn't visible. I have a Qt patch for that, but it's blocked on one CI machine on Qt because of a flaky unit test.