Open star-buck opened 6 years ago
As stated before I'm not super convinced by redditor comments, I've not seen anything confirmed on KF5.
But it'll put everyone at ease if we do some thorough tests.
Here's my proposed plan:
Things to test:
test copying files several hundred times with randomly generated file sets.
test subfolder not readable
test hidden dirs
test chown/chmod at destination failing
validate (by code review) that breadth first search will always be in order
test subdirs already cached from previous copy (with cached failure?)
fake a hypothetically crashing kio_file - what happens?
fake a crashing kuiserver - what happens?
recursive symlink - what happens?
test UTF-8 file collisions when run with C-locale in-process. file names are blind char*, encoding decoding is up to the app, what if this causes collisions when converting to QUrl?
first test but with additional conversion: from $format1 to $format2: ext2, iso9660, ntfs, vfat with long file names, truncatable file names, etc
investigate this line: if (QT_LSTAT(ep->d_name, &st) == -1) { continue; // how can stat fail? }
Test faking a fail, (ld_preload?) Add an assert regardless
Use of QDir::setCurrent this is global, check thread count of kio file slave process and check there's no scope there for issues
(I have some wayland stuff to finish this week, will probably start next week)
There's a second issue of plasmashell using high CPU whilst we're copying, that's still a WIP, but somewhat separate.
Before start doing that, someone mentioned while copying the modified/added date(s) are not kept/are changed?
I suggest fixing that first and then I could also simply do a test copying a 1TB NAS drive to a local attached usb drive and back to see a) how long that takes (comparing it to a same sample from same nas to same usb on windows7) b) if file count stays the same c) if there is anything weird like messed up dates or changed/shortened filenames or filesizes spottable.
On Sat, Nov 10, 2018, 03:39 davidedmundson notifications@github.com wrote:
As stated before I'm not super convinced by redditor comments, I've not seen anything confirmed on KF5.
But it'll put everyone at ease if we do some thorough tests.
Here's my proposed plan:
Things to test:
-
test copying files several hundred times with randomly generated file sets.
test subfolder not readable
test hidden dirs
test chown/chmod at destination failing
validate (by code review) that breadth first search will always be in order
- what happens if it's not
test subdirs already cached from previous copy (with cached failure?)
fake a hypothetically crashing kio_file - what happens?
fake a crashing kuiserver - what happens?
recursive symlink - what happens?
test UTF-8 file collisions when run with C-locale in-process. file names are blind char*, encoding decoding is up to the app, what if this causes collisions when converting to QUrl?
first test but with additional conversion: from $format1 to $format2: ext2, iso9660, ntfs, vfat with long file names, truncatable file names, etc
investigate this line: if (QT_LSTAT(ep->d_name, &st) == -1) { continue; // how can stat fail? }
Test faking a fail, (ld_preload?) Add an assert regardless
Use of QDir::setCurrent this is global, check thread count of kio file slave process and check there's no scope there for issues
(I have some wayland stuff to finish this week, will probably start next week)
There's a second issue of plasmashell using high CPU whilst we're copying, that's still a WIP, but somewhat separate.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-437577906, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhoLn5lxU2ttJSj5mojDiYAO3jiwKks5utrrigaJpZM4WMwA4 .
In my initial brief tests timestamps were transferred correctly.
I have a real reproducible (though obscure) bug with trash:/ to fix. I'm doing that first as there's a chance that might lead to discovering issues with generic error handling that files also uses.
how long that takes
I know this github ticket refers to KIO copying being slow but it'll help me if we treat KIO being potentially faulty and KIO being slow as two separate tasks.
Update: I fixed the issue in trash I mentioned. Unfortunately the bug was local to within trash, and wouldn't have fixed anything else. At least we fixed something in this.
In all my copying, I have found that copying to NTFS is behaving weird. File count remained the same, but there's an oddity when viewing the folder size in dolphin that shows them being different. That gives me something concrete to follow up on.
Timestamp would be nice to have fixed On Nov 20, 2018 13:56, "davidedmundson" notifications@github.com wrote:
Update: I fixed the issue in trash I mentioned. Unfortunately the bug was local to within trash, and wouldn't have fixed anything else. At least we fixed something in this.
In all my copying, I have found that copying to NTFS does not keep timestamps. Even though that does work with "cp".
File count remained the same, but there's an oddity when viewing the folder size in dolphin that shows them being different. That gives me two concrete things to follow up on.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-440264036, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhhuA_jstE6bd2s24hdgRn4KTILcAks5uw_wIgaJpZM4WMwA4 .
NTFS file size is at least understood - there's a fix on phab, but I'm not sure it'll go in - it's a difference between two file systems.
But I've found something interesting, if you crash or kill dolphin at just the right time it comes up with "copying finished" in plasma rather than an error. This seems critically bad.
But I've found something interesting, if you crash or kill dolphin at just the right time it comes up with "copying finished" in plasma rather than an error. This seems critically bad.
Fixed!
As for being slow, we've got a few things:
use of sendfile rather than reading into memory and writing data. Now merged.
there's something seriously wrong with QCoreApplication::sendPostedEvent() at the event of SlaveBase::dispatchLoop removing this makes my test set of 1million tiny chars considerably faster. As in orders of magnitudes faster. I need to understand this and get a proper fix.
After that I have some two other areas on the slave side, which if fixed could get us ~4x faster on my tiny files dataset.
Whilst we're copying, we're still getting a load of requests to list directories, even though we already do that at the start. We need to see where that's coming from.
SlaveBase::finished reloads a config file (???) We're spending a lot of time here.
Great analysis and progress! Is there anyone else from KDE related to these who can help fixing/improving? On Nov 26, 2018 17:30, "davidedmundson" notifications@github.com wrote:
As for being slow, we've got a few things:
-
use of sendfile rather than reading into memory and writing data. Now merged.
there's something seriously wrong with QCoreApplication::sendPostedEvent() at the event of SlaveBase::dispatchLoop removing this makes my test set of 1million tiny chars considerably faster. As in orders of magnitudes faster. I need to understand this and get a proper fix.
After that I have some two other areas on the slave side, which if fixed could get us ~4x faster on my tiny files dataset.
-
Whilst we're copying, we're still getting a load of requests to list directories, even though we already do that at the start. We need to see where that's coming from.
SlaveBase::finished reloads a config file (???) We're spending a lot of time here.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/blue-systems/plasma-5.14/issues/139#issuecomment-441704579, or mute the thread https://github.com/notifications/unsubscribe-auth/AAzRhl26wo9UCVy4yX1qRZf0FZs7ZBTnks5uzBc3gaJpZM4WMwA4 .
Is there anyone else from KDE related to these who can help
[Redacted - I shouldn't have made hiring notes public]
All my changes are pushed, Kai also did one. They'll be in the frameworks release at the start of Jan.
In terms of speed, there was some mistakes that are fixed. Some significant.
There's lots more little tiny code-polishing cleanups and probably can get CPU usage of the transferring app. There's scope to cut ~20% of the CPU load of the transferring app doing that.
There are some huge refactors I can suggest that will bring bigger savings, but then we're talking larger 1-2 week projects.
I'm moving back to Kwin/Plasma today because of the other BS work.
In terms of file loss, there was an incorrect error handling I found and fixed. I haven't seen anything else, I've not had any data loss, and I've been copying a million files repeatedly. That isn't to say there isn't a bug, but it's certainly not present in the standard case.
That hiring all went a lot faster than expected:
This is a brain dump, then I'll turn it into a more concrete list as separate issues for chimnoy
Cleanup refactor tasks:
Check for abuses of QDateTime converting locales, when we can keep a timestamp in UTC (link to Kai's patch search for more)
also check for misuse of qdatetime as "elapsed timer"
Port remaining QObject::connect(char*, ...)
...warn about the slot overload trap first!
show how to hotspot. Identify some parts together and work through some together.
Big refactors:
make move, copy all the files, then delete all the files, not copy and delete per file as we go through. (will make stuff safer)
file.so btrfs cow support
sendfile64
Monumental refactors:
Handle recursive copying inside the ioslave.
sparse file handling (cp has tonnes of code for doing this, hopefully we could copy it.)
Special casing file.so just as a separate thread in the calling app (requires talking with David F - would possibly be an alternative to recursive copying inside ioslave)
Great. You can also do a new ticket with all points as checkpoints.
Some braindump ideas from me:
PreviewJob
rather than the thumbnail:/
KIO slave, and it currently generates one thumbnail after another and is quite slowabout JobView, I would love a V3. We basically took V2 as-is from KDE4 but the world has changed a bit since:
destUrl
or whatever to provide a meningful summary, and a useful "Open" (if I copied/received a file) or "Open destination" (if I copied a folder) feature.Great. You can also do a new ticket with all points as checkpoints.
I'll reopen each as a new ticket, then we can have a clear discussion there.
As people are apparently reading my KIO notes publicly, there is one monumentally surprising thing that will make a difference.
The Clipboard.
When one cuts/pastes we update the clipboard as we move each file individually so that the clipboard contains the contents of what's remaining. I don't /really/ want to meddle with that behaviour as it seems like its data related.
Clipboards (on X especially) are slow blocking things. Dolphin updates, plasma reads the clipboard, which blocks dolphin which then updates...
This is made a trillion times worse by the fact that every QQuickText field in every application or focus will fetch mimedata info - in order to update a context menu entry that isn't visible. I have a Qt patch for that, but it's blocked on one CI machine on Qt because of a flaky unit test.
https://www.reddit.com/r/linux/comments/9a7k2i/psa_on_kde_never_use_the_gui_when_copying_a_large/