gco / xee

Automatically exported from code.google.com/p/xee

Improve directory scanning performance #242

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Have a directory with 24000+ images
2. Delete any image in this directory from inside Xee
3. About 0.2 seconds later, Xee locks up for a good 20 seconds on my system

What is the expected output? What do you see instead?

The ability to navigate to other images instantly, without having to wait 
until Xee rescans the directory.

I've created a patch that partially solves the problem (it brings the scan 
time down to about 2 seconds), but I still have to wait after every image I 
delete, which is inconvenient. The patch is included as an attachment.

A binary built with the patch is attached too, so you can test how it works.

Original issue reported on code.google.com by quen...@gmail.com on 6 Nov 2010 at 1:19

Attachments:

GoogleCodeExporter commented 9 years ago
As far as I can tell, this patch might speed things up, but with it Xee would 
no longer detect when a file is renamed in the directory, so I can't really 
apply it. XeeFSRef should not really do any caching. Currently, XeeFileSource 
already caches before sorting, but it needs to re-cache for each sort.
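
The rename concern can be illustrated with a small sketch. This is Python rather than Xee's Objective-C, and `CachedRef` is a hypothetical stand-in, not XeeFSRef or the FSRef API; it only assumes that a reference which reads a file's name once and never refreshes it will keep reporting the old name.

```python
import os
import tempfile


class CachedRef:
    """Hypothetical file reference that caches the name it was created with."""

    def __init__(self, directory, name):
        self.directory = directory
        self.cached_name = name  # read once, never refreshed

    def exists_under_cached_name(self):
        # True only while the file still carries the cached name.
        return os.path.exists(os.path.join(self.directory, self.cached_name))


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "a.png"), "w").close()
    ref = CachedRef(d, "a.png")
    fresh_before = ref.exists_under_cached_name()

    # Rename the file behind the reference's back.
    os.rename(os.path.join(d, "a.png"), os.path.join(d, "b.png"))
    stale_after = not ref.exists_under_cached_name()
    current_names = sorted(os.listdir(d))

print(fresh_before, stale_after, current_names)
```

After the rename, the cached name no longer matches anything in the directory, which is the kind of staleness that any caching scheme would need an invalidation step to avoid.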

Original comment by paracel...@gmail.com on 6 Nov 2010 at 1:49

GoogleCodeExporter commented 9 years ago
OK, but at least try to do something about it, because the number of 
geteuid(), getattrlist() and getdirentriesattr() syscalls is off the charts.

This is what happens on stock 2.1.1 if I delete one image in a directory that 
has 23635 images in it (23624 after a deletion) plus 6 non-image entries. The 
top 3 syscalls by call count:
geteuid                                    175705
getattrlist                                137465
getdirentriesattr                           34565

Got that using this command:
$ sudo dtruss -c -n Xee
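
To put those totals in proportion, they can be divided by the number of directory entries. The sketch below is Python and uses only the figures quoted above; the variable and function names are illustrative, and it assumes the totals are dominated by the single rescan that follows the deletion.

```python
# Syscall totals and entry counts copied from the dtruss summary above.
syscalls = {
    "geteuid": 175705,
    "getattrlist": 137465,
    "getdirentriesattr": 34565,
}
entries = 23635 + 6  # images before the deletion plus non-image entries


def calls_per_entry(counts, n_entries):
    """Return the average number of calls per directory entry."""
    return {name: total / n_entries for name, total in counts.items()}


per_entry = calls_per_entry(syscalls, entries)
for name, ratio in per_entry.items():
    print(f"{name:<20}{ratio:6.2f} calls per entry")
```

Under that assumption, this works out to roughly seven geteuid() calls and nearly six getattrlist() calls for every entry in the directory.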

geteuid(), getattrlist() and getdirentriesattr() are called from 
FSGetCatalogInfo() and FSGetCatalogInfoBulk(). You can check that yourself:

$ sudo dtruss -n Xee -faces -t geteuid
$ sudo dtruss -n Xee -faces -t getattrlist
$ sudo dtruss -n Xee -faces -t getdirentriesattr

Reducing number of calls to FSGetCatalogInfo() to 1 per XeeFSRef reduces dir 
scan time from 15 seconds to 1.
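
The idea of one lookup per reference can be sketched as follows. This is Python, not the attached patch: `FileRef`, `count_lookups` and the use of `os.stat` as a stand-in for an FSGetCatalogInfo()-style call are all assumptions made for illustration.

```python
import os
import stat
import tempfile


class FileRef:
    """Hypothetical file reference for illustration; this is not XeeFSRef."""

    lookups = 0  # total metadata lookups across all instances

    def __init__(self, path, cache):
        self.path = path
        self.cache = cache
        self._info = None

    def _catalog_info(self):
        # Stand-in for one FSGetCatalogInfo()-style metadata lookup.
        if self.cache and self._info is not None:
            return self._info
        FileRef.lookups += 1
        self._info = os.stat(self.path)
        return self._info

    def size(self):
        return self._catalog_info().st_size

    def mtime(self):
        return self._catalog_info().st_mtime

    def is_dir(self):
        return stat.S_ISDIR(self._catalog_info().st_mode)


def count_lookups(directory, cache):
    """Query three attributes per entry and return the lookups performed."""
    FileRef.lookups = 0
    refs = [FileRef(os.path.join(directory, n), cache)
            for n in sorted(os.listdir(directory))]
    for ref in refs:
        ref.size()
        ref.mtime()
        ref.is_dir()
    return FileRef.lookups


with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        open(os.path.join(tmp, f"img{i}.png"), "w").close()
    uncached = count_lookups(tmp, cache=False)
    cached = count_lookups(tmp, cache=True)

print(uncached, cached)
```

With five files and three attribute queries each, the uncached run performs 15 lookups and the cached run 5. The trade-off is that cached metadata is only as fresh as the moment it was read.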

Before (just uncommented your NSLog() call in XeeDirectorySource):
06/11/10 17:21:00   Xee[13736]  readDirectory: 16.01793 s read, 0.475021 s sort, 16.49295 s total
06/11/10 17:21:22   Xee[13736]  readDirectory: 15.70392 s read, 0.520544 s sort, 16.22447 s total

After (with this patch):
06/11/10 17:23:29   Xee[13794]  readDirectory: 1.25024 s read, 0.075161 s sort, 1.32541 s total
06/11/10 17:23:35   Xee[13794]  readDirectory: 1.04785 s read, 0.127194 s sort, 1.17505 s total

So, it's not just 'might' speed that up, it definitely speeds that up. And 
sorting is just a fraction of time spent in readDirectory().

It might be useful to recreate this with 25k images in a single directory.

Original comment by quen...@gmail.com on 6 Nov 2010 at 2:27

GoogleCodeExporter commented 9 years ago
I agree that my patch is not the best solution, though. But 16 s vs 1 s is a 
noticeable speedup.

Original comment by quen...@gmail.com on 6 Nov 2010 at 2:32

GoogleCodeExporter commented 9 years ago
It would be nice to do something about it, but it is far from obvious what can 
be done without breaking other functionality. Massive numbers of files in a 
single directory are a rare enough occurrence that it isn't much of a 
priority.

Original comment by paracel...@gmail.com on 6 Nov 2010 at 2:58

GoogleCodeExporter commented 9 years ago
OK, in my case massive numbers of files occur every day, and the binary with 
this patch applied is the only thing that keeps me from looking for 
alternatives.

Original comment by quen...@gmail.com on 6 Nov 2010 at 3:02

GoogleCodeExporter commented 9 years ago
Separate diff of FSGetCatalogInfo() call reduction.

Original comment by quen...@gmail.com on 6 Nov 2010 at 3:33

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by paracel...@gmail.com on 12 Apr 2011 at 5:00

GoogleCodeExporter commented 9 years ago
3.0 uses NSURLs instead of FSRefs for storing file lists, so this code is no 
longer applicable.

Hopefully NSURLs are also fast, but I have done no real testing.

Original comment by paracel...@gmail.com on 18 Jan 2013 at 11:33