Now for something different: increase performance with multi-threads/multi processing

oPromessa commented 2 years ago

Is your feature request related to a problem? Please describe.

Currently processing ~33000 pics of a Photos DB. The full Photos size is ~90000.

It takes about 1.7 hours to process --update phase

Processed: 33764 photos, exported: 0, updated: 0, skipped: 33699, updated EXIF data: 70, missing: 72, error: 0
Elapsed time: 6127.738 seconds

and about 3 hours to process the delete phase

for a total of ~5 hours

[notice] --- Start
Mon 10 Jan 2022 23:52:00 WET
(...)
[notice] --- End
Tue 11 Jan 2022 05:12:07 WET

So, if we extrapolate it could take up to 15 hours to process 90000 pics.

Describe the solution you'd like Maybe launch in parallel multiple copy tasks? Add a flag for --multiprocessing <n>with the level of processes to launch. To achieve this in other scenarios I've used with relative success the library multiprocessing on https://github.com/oPromessa/flickr-uploader/blob/master/lib/mprocessing.py Of course the code must be ware of multiprocessing and use locks to access database and will be dependent of the performance of the underlying HW and LAN.

Describe alternatives you've considered Hmmm. Launch multiple osxphotos at the same time each with it's own selection/filter process. Problem is that, if each one uses its own .osxphotos_db to avoid concurrency you'll end up deleting each other's pics.

Additional context I'm copying over a 100Mbps lan (will upgrade to 1Gbps lan shortly) from an iMac (i7, 16GB, ssd) into a Synology DSM 918+.

RhetTbull commented 2 years ago

I've considered this in the past but the additional complexity has kept me from looking more deeply into this. osxphotos has been optimized for export to the local mac where you're likely to be I/O bound waiting on the disk. It does use copy-on-write to speed up copying, etc. I'm not familiar with Synology devices but assuming they cache writes so they can handle multiple simultaneous writes and your network is fast enough, the approach you propose might work.

The export code would need to be updated and this is no small feat (I'm currently in the middle a big refactoring of the export code that will make it easier to change). Currently, the export code processes all actions serially photo by photo:

download the photo if missing
copy the photo
create sidecars if needed
update with exiftool if needed
process next photo

The cleanup phase happens at the end after all exports are done.

It takes about 1.7 hours to process --update phase

This is surprising given you only exported 70 updated photos. Doing these 70 copies in parallel is unlikely to make much of a dent in the update time. My guess is the slow part is checking the file stat of the 30000 images across the LAN & NAS. If this is the slow part, then throwing more processes at may not help much if the NAS can't keep up.

Before tackling a multithreading project I'd suggest doing some profiling to see where the slowdowns are actually occurring.

I'll take a look at your project to see what you did.

RhetTbull commented 2 years ago

Do you space on your local disk to do an export or a partial export? It would be interesting to compare that to the same export on the NAS.

oPromessa commented 2 years ago

Sorry!!!! rookie mistake I believe the Mac went to sleep half way thru the process. I'll gather more stats with the caffeinate option and come back to you!

RhetTbull commented 2 years ago

A few more thoughts:

The main obstacle to effective multiprocessing in the current design is that the target filename is not fully known until the actual export happens. The CLI code renders the --directory and --filename templates to compute the intended target but then the export code needs to check to see if an edited version will be exported, will side cars be generated, and does the target already exist and if so, is it in the export database and does it match the source photo or is it a name collision. With multiple processes exporting, you'd run the risk of name collisions.

If the code to determine name collisions, etc. happened before calling export, you could deterministically know the target file then you could spawn multiple export processes. One downside with this approach is that you might have some "false" name collisions -- e.g. you thought there would be a name collision but one file was missing and didn't get exported so instead of getting IMG_1234.jpg and IMG_1234 (1).jpg, you got IMG_1234 (1).jpg as the first was missing and didn't get exported. Some of this would be difficult to work around when using --download-missing but I think having some "false" collisions is better then real collisions.

RhetTbull commented 2 years ago

One other thought -- the export database is implemented in sqlite which handles multiprocessing fine so that shouldn't be an impediment.

oPromessa commented 2 years ago

Sorry for the delay. Here's a test with 88 pics comparing local SSD vs NAS (100Mbps) vs NAS (1Gbps). Using --update option.

In a nutshell... go 1Gbps :smiley:
The 2nd run on NAS (100) was a tad faster than NAS (1G) possibly due to additional caching the NAS (100) Server has vs NAS (1G) Server. But is residual the difference.

Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
1st	88	37.158	183.095	99.971	0.422	2.080	1.136
2nd	88	17.630	19.160	19.797	0.200	0.218	0.224

--1st SSD
Processed: 88 photos, exported: 88, updated: 0, skipped: 0, updated EXIF data: 88, missing: 0, error: 0
Elapsed time: 37.158 seconds
--update SSD
Processed: 88 photos, exported: 0, updated: 0, skipped: 88, updated EXIF data: 0, missing: 0, error: 0
Elapsed time: 17.630 seconds

--1st NAS (100Mbps Network)
Processed: 88 photos, exported: 88, updated: 0, skipped: 0, updated EXIF data: 88, missing: 0, error: 0
Elapsed time: 183.095 seconds
--update NAS (100Mbps Network)
Processed: 88 photos, exported: 0, updated: 0, skipped: 88, updated EXIF data: 0, missing: 0, error: 0
Elapsed time: 19.160 seconds

--1st NAS (1Gbps Network)
Processed: 88 photos, exported: 88, updated: 0, skipped: 0, updated EXIF data: 88, missing: 0, error: 0
Elapsed time: 99.971 seconds
--update NAS (1Gbps Network)
Processed: 88 photos, exported: 0, updated: 0, skipped: 88, updated EXIF data: 0, missing: 0, error: 0
Elapsed time: 19.797 seconds

RhetTbull commented 2 years ago

Thanks! This is useful. I've done some profiling on the export code and found it takes a lot of it's time in the code that validates file names to avoid name collisions so that might be worth a look before tackling something like multithreading. I can imagine that if going across a network all the remote file lookups would slow things down. I was also surprised to see the export db is getting opened/closed with each photo. I'd expected it to stay open so I need to take a look there. Full profile results attached.

osxphotos export ~/Desktop/export --from-date 2021-01-01
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Exporting 3853 photos to /Users/rhet/Desktop/export...
  [####################################]  100%
Processed: 3853 photos, exported: 372, missing: 3504, error: 0
Elapsed time: 65.114 seconds
         17361339 function calls (16993129 primitive calls) in 83.575 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      4/1    0.000    0.000   83.799   83.799 {built-in method builtins.exec}
        1    0.000    0.000   83.798   83.798 <string>:1(<module>)
        1    0.148    0.148   83.798   83.798 cli.py:1305(_export)
     3853    0.289    0.000   64.202    0.017 cli.py:2634(export_photo)
     3866    0.192    0.000   47.759    0.012 cli.py:3043(export_photo_to_directory)
     3866    0.230    0.000   45.088    0.012 photoexporter.py:483(export2)
     3866    0.043    0.000   40.263    0.010 photoexporter.py:748(_validate_dest_path)
     3866    0.142    0.000   39.708    0.010 utils.py:384(increment_filename_with_count)
     3866    0.104    0.000   38.827    0.010 utils.py:291(findfiles)
   765118   19.986    0.000   32.955    0.000 utils.py:284(normalize_fs_path)
     3866    0.671    0.000   32.654    0.008 utils.py:302(<listcomp>)
        1    0.001    0.001   17.654   17.654 photosdb.py:92(__init__)
        1    8.426    8.426   17.635   17.635 photosdb.py:1618(_process_database5)
     7719    0.075    0.000   13.821    0.002 cli.py:3007(_render_suffix_template)
     3866    0.041    0.000   13.723    0.004 photoinfo.py:1413(render_template)
     3866    0.097    0.000    9.643    0.002 phototemplate.py:354(__init__)
     3866    0.034    0.000    9.438    0.002 export_db.py:763(__init__)
   765131    8.924    0.000    8.924    0.000 __init__.py:77(__enter__)
    79968    8.517    0.000    8.517    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
     3866    0.075    0.000    8.321    0.002 export_db.py:773(_open_export_db)
     3867    0.168    0.000    7.549    0.002 export_db.py:639(_create_db_tables)

profile.txt

And ordered by total time (time spent in the function itself)


Processed: 3853 photos, exported: 372, missing: 3504, error: 0
Elapsed time: 102.641 seconds
         25614407 function calls (25246177 primitive calls) in 119.617 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2172957   44.963    0.000   73.764    0.000 utils.py:284(normalize_fs_path)
  2172970   19.521    0.000   19.521    0.000 __init__.py:77(__enter__)
        1    8.110    8.110   16.243   16.243 photosdb.py:1618(_process_database5)
  2172970    7.747    0.000    7.747    0.000 __init__.py:80(__exit__)
    79949    6.504    0.000    6.504    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
     3870    4.281    0.001    4.281    0.001 {built-in method posix.listdir}
        1    2.996    2.996    5.074    5.074 _photosdb_process_searchinfo.py:37(_process_searchinfo)
     3866    1.587    0.000   74.578    0.019 utils.py:302(<listcomp>)
  2181137    1.132    0.000    1.132    0.000 {method 'match' of 're.Pattern' objects}
  2181625    0.951    0.000    0.952    0.000 {method 'decode' of 'bytes' objects}
     3867    0.837    0.000    0.837    0.000 {built-in method posix.mkdir}
  1058578    0.773    0.000    1.338    0.000 utils.py:371(normalize_unicode)
        1    0.714    0.714    0.722    0.722 _photosdb_process_scoreinfo.py:40(_process_scoreinfo_5)
     8105    0.678    0.000    0.678    0.000 {method 'commit' of 'sqlite3.Connection' objects}
     3866    0.643    0.000    1.740    0.000 utils.py:303(<listcomp>)
    17247    0.628    0.000    0.657    0.000 {built-in method posix.stat}
  2172970    0.587    0.000    0.587    0.000 __init__.py:72(__init__)
     3877    0.579    0.000    0.579    0.000 {method 'close' of 'sqlite3.Connection' objects}
        1    0.525    0.525    0.562    0.562 _photosdb_process_faceinfo.py:173(_process_faceinfo_5)
  2616628    0.517    0.000    0.517    0.000 {built-in method builtins.isinstance}
     7496    0.472    0.000    0.565    0.000 photoinfo.py:1560(__repr__)
     3879    0.437    0.000    0.437    0.000 {built-in method _sqlite3.connect}
        2    0.399    0.199    0.399    0.199 {method 'fetchall' of 'sqlite3.Cursor' objects}
     3866    0.396    0.000    0.396    0.000 {built-in method posix.rmdir}
    11675    0.364    0.000    0.364    0.000 sre_parse.py:111(__init__)
    40164    0.335    0.000    0.595    0.000 pathlib.py:64(parse_parts)

profile2.txt

RhetTbull commented 2 years ago

Interesting that for the export of 372 files, there were 765118 calls to normalize_fs_path! Definitely need to take a look at what's going on there.

RhetTbull commented 2 years ago

I've added a hidden --profile option to the export command that enables profiling. It also has a hidden --profile-sort option to specify one or more sort keys for profiled stats output. I'll include in the next release.

I made some changes to the filename normalization to minimize calls to normalize_fs_path (which can't be optimized further as it's really only a single line call to an objective-C function). This has reduced the calls to normalize_fs_path for the test case above more than an order of magnitude from 765118 calls to 11610 calls!

The new --profile flag should make it much easier to start doing some optimization.

Profiling...
Exporting 3855 photos to /Users/rhet/Desktop/export...

Processed: 3855 photos, exported: 372, missing: 3506, error: 0
Elapsed time: 123.597 seconds
Profiling completed
         14850806 function calls (14482537 primitive calls) in 149.191 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    80008   25.722    0.000   25.722    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
     3868   24.039    0.006   37.314    0.010 utils.py:306(list_directory)
        1    9.479    9.479   23.764   23.764 photosdb.py:1618(_process_database5)
   733207    8.455    0.000    9.180    0.000 _convenience.py:131(<lambda>)
        1    4.946    4.946    7.770    7.770 _photosdb_process_searchinfo.py:37(_process_searchinfo)
     3869    3.123    0.001    3.123    0.001 {built-in method posix.mkdir}
     3868    2.783    0.001   11.963    0.003 utils.py:314(<listcomp>)
     3879    2.328    0.001    2.328    0.001 {method 'close' of 'sqlite3.Connection' objects}
    11610    2.283    0.000    4.076    0.000 utils.py:284(normalize_fs_path)
    17255    2.193    0.000    2.351    0.000 {built-in method posix.stat}
     3881    2.065    0.001    2.065    0.001 {built-in method _sqlite3.connect}
     3869    1.944    0.001    1.944    0.001 {built-in method posix.rmdir}

profile_new.txt

oPromessa commented 2 years ago

I'll use it once available. I still have a few thousand pics to transfer so I'll have plenty of opportunity to test it.

RhetTbull commented 2 years ago

@oPromessa I've added the --profile to export in v0.44.11. There are a number of hidden options and commands I use for debugging. These can now be exposed by setting the environment variable OSXPHOTOS_SHOW_HIDDEN=1, for example, this can be done on command line as:

OSXPHOTOS_SHOW_HIDDEN=1 osxphotos help export to see hidden options for export

or

OSXPHOTOS_SHOW_HIDDEN=1 osxphotos help to see hidden commands

RhetTbull commented 2 years ago

I found the cause of the thousands of unnecessary sqlite calls to the ExportDB code (see #603). I'll work on that later this week which should cause a bit of a speed up as thousands of sqlite database instances won't need to be needlessly recreated.

RhetTbull commented 2 years ago

@oPromessa I fixed the unneeded sqlite calls. Try version v0.44.13 next time you export photos with the --profile flag. On my quick testing, most of the time is spent in the list_directory function used to list files for avoiding name conflicts. I can imagine that over a NAS this will be even slower. One possible fix would be to create a virtual filesystem cache (read the entire export directory once at startup) and compare names to the cache. This would only work if no other process was writing to the export directory at the same time but might give a big speed boost.

Profiling...
Profile sort_stats order: ('tottime',)
Exporting 3856 photos to /Users/rhet/Desktop/export...

Processed: 3856 photos, exported: 372, missing: 3507, error: 0
Elapsed time: 90.059 seconds
Profiling completed
         15173615 function calls (14805239 primitive calls) in 111.012 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     3869   26.692    0.007   40.776    0.011 utils.py:307(list_directory)
        1   10.551   10.551   20.583   20.583 photosdb.py:1618(_process_database5)
   886150    8.688    0.000    9.427    0.000 _convenience.py:131(<lambda>)
        1    3.972    3.972    6.486    6.486 _photosdb_process_searchinfo.py:37(_process_searchinfo)
     3869    3.387    0.001   12.814    0.003 utils.py:315(<listcomp>)
     3870    2.412    0.001    2.412    0.001 {built-in method posix.mkdir}
    11616    2.263    0.000    3.665    0.000 utils.py:285(normalize_fs_path)
     2629    1.741    0.001    1.741    0.001 {method 'execute' of 'sqlite3.Cursor' objects}
     3869    1.728    0.000   52.752    0.014 utils.py:292(findfiles)
      373    1.637    0.004    1.637    0.004 {method 'commit' of 'sqlite3.Connection' objects}
    13390    1.587    0.000    1.659    0.000 {built-in method posix.stat}
    15836    1.567    0.000    1.567    0.000 photoexporter.py:217(__init__)
    15498    1.441    0.000    1.441    0.000 __init__.py:77(__enter__)
7771/3886    1.267    0.000    2.791    0.001 sre_parse.py:493(_parse)
     3870    1.231    0.000    1.231    0.000 {built-in method posix.rmdir}
  2621683    1.222    0.000    1.222    0.000 {built-in method builtins.isinstance}
   899003    1.155    0.000    1.155    0.000 {method 'match' of 're.Pattern' objects}

Edit: I'm overthinking this. In most cases osxphotos doesn't need to know what's in a directory, only whether a file exists so I think this can be sped up by just using is.path.exists()

oPromessa commented 2 years ago

I will run it later tonight with the main 65K photos library. But meanwhile did a test with 88 pics:

I ran a full initial export of about with --profile and tottime for sorting. I attach two files of two runs with the same config but:

from local SSD exporting to the local SSD: so_ssd.log
from local SSD exporting to theNAS Disk (100Mbps): so_nas.log

RhetTbull commented 2 years ago

Edit: I'm overthinking this. In most cases osxphotos doesn't need to know what's in a directory, only whether a file exists so I think this can be sped up by just using is.path.exists()

I tried this and immediately realized why I implemented the collision code the way I originally did. What osxphotos currently does is increment the filename if any file stem matches. e.g. if the image being exported is named "IMG_1234.JPG" and there's already an "IMG_1234.PNG" in the destination folder, osxphotos will export the image as "IMG_1234 (1).JPG". This is done to avoid collisions with side car files which in this case might both be named "IMG_1234.xml" for example.

RhetTbull commented 2 years ago

Looking at your logs, the majority of the time is spend on {method 'readline' of '_io.BufferedReader' objects} which is called by the exiftool code as this is how osxphotos communicates with the exiftool subprocess via a pipe. The NAS is 4X slower (not surprising). One potential speedup is to delay copying the file to the NAS until after exiftool processes it but that makes the export code more complicated.

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
ssd:    43548   34.903    0.001   34.903    0.001 {method 'readline' of '_io.BufferedReader' objects}
nas:    43548  127.876    0.003  127.876    0.003 {method 'readline' of '_io.BufferedReader' objects}

oPromessa commented 2 years ago

I believe it confirms your comments. Attached is the profile.69k.log profile.log run of 69K pics with 1.8K exported pics.

RhetTbull commented 2 years ago

Thanks for the profile data. I think I can cut this time (hopefully significantly) by processing the exiftool updates on the local machine then copying the file to the NAS. I've just finished a series of refactoring that will allow me to insert this into the code fairly easily and have the basic design for this change done.

I've also done some thinking on a basic design for multiprocessing but will need to do some research and testing about sharing data between processes. Currently, the entire Photos database gets processed and stored in a PhotosDB instance. In its current design, this object is not compatible with pickle which means it can't be shared between processes. It's also a large object so not sure you'd want to pass it around between processes. But the export code all depends on access to PhotoInfo objects which in turn access the PhotoDB object.

oPromessa commented 2 years ago

I was wondering on this flow:

- download the photo if missing
- copy the photo
- create sidecars if needed
- update with exiftool if needed
- process next photo

...if the exiftool processing could be done in-memory (of course it may be an issue for large multi gigs videos -- but hey it would be up to the OS to do some swapping to disk itself) in Step 2.:

- download the photo if missing
- load the photo in-memory and update with exiftool if needed
- write with exiftool to the destination file: aka copy the photo
- create sidecars if needed
- process next photo

Read a bit about exiftool programming options but I confess I'm no expert ;)

Does this make sense?
Does exiftool allow one to make this operation in one go: reading from a local file, making changes to metadata, and writing it into the destination file?
- of course many of the protections and checks (like name duplicates) osxphotos does in the copy step may not be possible with an "exiftool copy process".

RhetTbull commented 2 years ago

Your first flow is basically what happens today.

I'm almost done with a new flow that

stages a temporary copy of the photo (either from the local database or downloads from iCloud)
converts to jpeg if necessary
runs exiftool if necessary
copies the temporary file to the export location

This will avoid sending any file to the export location until it's "done" and avoid any unnecessary reads or writes from/to the export location.

if the exiftool processing could be done in-memory

It can't. The approach I implemented creates a temporary copy of the file, but assuming you're on APFS file system (Catalina+) and the Photos library is on the main disk (the most likely use case), then osxphotos takes advantage of APFS copy-on-write so the temporary file actually takes no time and the writing to disk doesn't actually happen until the conversion or exiftool update. I thought about using a RAM disk (old school style) but it's almost always better to let the OS manage memory and with APFS copy-on-write, it's just as fast.

Does exiftool allow one to make this operation in one go

It does and this would offer a bit of a speedup as the updates wouldn't have to be written to local disk then copied to the NAS but then it would be hard to do things like --retry or catch errors. It would also mean that it some cases the file is copied by osxphotos and in others copied by exiftool which makes the code more complex.

Once I finish this next round of refactoring on the exiftool bits, I have an idea for parallel processing. I think true multi-processing will be hard due to the complex nature of the data structures that need to be passed around the code to make that work. However, I think that multiple copies of osxphotos could run in parallel, all using the same export database. I've got an idea for an "export runner" mode that would collect UUIDs of all the photos that need to be exported then parse those up to multiple osxphotos export processes to manage and spawn those as separate processes. Would be some memory overhead of reading in the database to multiple processes but you'd get speed up from running in parallel. Some complexities with --report, etc. but those are probably workable.

RhetTbull commented 2 years ago

Initial results from new version 0.45.0. This version doesn't copy the photo to destination until after exiftool (and jpeg conversion) is complete. Following results are for exporting 100 photos to a network drive. With --exiftool the improvement is about 66% (exactly what would be expected from cutting out 2/3 of the transfer over the network). With both --exiftool and --convert-to-jpeg, the speed up is about 50% (I'm sure one limiting factor is my old Macbook).

osxphotos version 0.44.13

exporting 100 photos with --exiftool

time osxphotos export /Volumes/rhet/Public/export --uuid-from-file ~/Dropbox/Code/osxphotos/not_missing_uuid.txt --exiftool --exportdb /private/tmp/osxphotos.db
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Warning: export database '/private/tmp/osxphotos.db' is in a different directory than export destination '/Volumes/rhet/Public/export'
Exporting 100 photos to /Volumes/rhet/Public/export...
  [####################################]  100%
Processed: 100 photos, exported: 115, missing: 0, error: 1
Elapsed time: 372.266 seconds

export 100 photos with --exiftool and --convert-to-jpeg

time osxphotos export /Volumes/rhet/Public/export --uuid-from-file ~/Dropbox/Code/osxphotos/not_missing_uuid.txt --exiftool --exportdb /private/tmp/osxphotos.db --convert-to-jpeg
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Warning: export database '/private/tmp/osxphotos.db' is in a different directory than export destination '/Volumes/rhet/Public/export'
Exporting 100 photos to /Volumes/rhet/Public/export...
  [####################################]  100%
Processed: 100 photos, exported: 115, missing: 0, error: 0
Elapsed time: 527.868 seconds

osxphotos version 0.45.0

export 100 photos with --exiftool

time python -m osxphotos export /Volumes/rhet/Public/export --uuid-from-file ~/Dropbox/Code/osxphotos/not_missing_uuid.txt --exiftool --exportdb /private/tmp/osxphotos.db
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Warning: export database '/private/tmp/osxphotos.db' is in a different directory than export destination '/Volumes/rhet/Public/export'
Exporting 100 photos to /Volumes/rhet/Public/export...
  [####################################]  100%
Processed: 100 photos, exported: 115, missing: 0, error: 1
Elapsed time: 137.268 seconds

export 100 photos with --exiftool and --convert-to-jpeg

time python -m osxphotos export /Volumes/rhet/Public/export --uuid-from-file ~/Dropbox/Code/osxphotos/not_missing_uuid.txt --exiftool --exportdb /private/tmp/osxphotos.db --convert-to-jpeg
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Warning: export database '/private/tmp/osxphotos.db' is in a different directory than export destination '/Volumes/rhet/Public/export'
Exporting 100 photos to /Volumes/rhet/Public/export...
  [####################################]  100%
Processed: 100 photos, exported: 115, missing: 0, error: 0
Elapsed time: 247.882 seconds

RhetTbull commented 2 years ago

I tried running two osxphotos exports at the same time, splitting the UUIDs between them. For the --exiftool scenario above, this reduced the run time from 137 seconds to 102 seconds so this may be a viable solution to get some improvement. osxphotos could have a mode that opens a pipe to each process to monitor overall progress (and perhaps a special option for export that outputs progress in json format for easy consumption by the monitoring process).

Exporting 50 photos to /Volumes/rhet/Public/export...

Processed: 50 photos, exported: 57, missing: 0, error: 1
Elapsed time: 102.416 seconds

Exporting 50 photos to /Volumes/rhet/Public/export...

Processed: 50 photos, exported: 58, missing: 0, error: 0
Elapsed time: 101.908 seconds

oPromessa commented 2 years ago

This is amazing!!! Will try the new version and provide feedback.

oPromessa commented 2 years ago

Super results in my 88 pics testing.

Remark: the .osxphotos database file is on its default location.

Updated performance table:

osxphotos version	Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
0.44.13	1st	88	37.158	183.095	99.971	0.422	2.080	1.136
0.44.13	2nd	88	17.630	19.160	19.797	0.200	0.218	0.224
0.45.0	1st	88	-	94.290	53.304	-	1.071	0.605
0.45.0	2nd	88	0.576	6.922	3.774	0.006	0.078	0.042

One thing though, I timed the execution of the 2nd run on NAS(100) with time and it does reference a different value from Elapsed. The first is measuring the total time and the second is possibly measuring the export phase only.

$ time bash runit_nas.bash 

real    0m42.515s
user    0m21.059s
sys 0m4.283s
(venv) 
$  egrep "Processed|Elapsed" so.log
Processed: 88 photos, exported: 0, updated: 0, skipped: 88, updated EXIF data: 0, missing: 0, error: 0, touched date: 0
Elapsed time: 6.922 seconds

A. NAS (100) 1st run profile

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  179   45.571    0.255   45.584    0.255 fileutil.py:93(copy)
43548   35.576    0.001   35.576    0.001 {method 'readline' of '_io.BufferedReader' objects}
    1   15.064   15.064   33.609   33.609 photosdb.py:1618(_process_database5)
    1    8.832    8.832   13.845   13.845 _photosdb_process_searchinfo.py:37(_process_searchinfo)

B. NAS (1G) 1st run profile

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
43548   35.254    0.001   35.254    0.001 {method 'readline' of '_io.BufferedReader' objects}
    1   14.393   14.393   32.979   32.979 photosdb.py:1618(_process_database5)
    1    8.863    8.863   13.914   13.914 _photosdb_process_searchinfo.py:37(_process_searchinfo)
  179    7.023    0.039    7.035    0.039 fileutil.py:93(copy)

5.Will run now the 80K file upload/update on to the NAS(100).

RhetTbull commented 2 years ago

Great!

One thing though, I timed the execution of the 2nd run on NAS(100) with time and it does reference a different value from Elapsed.

Yes, the elapsed time only shows the time from when osxphotos begins exporting photos, not any of the setup time. Perhaps I should change that to avoid confusion?

RhetTbull commented 2 years ago

@allcontributors add @oPromessa for ideas, test

allcontributors[bot] commented 2 years ago

@RhetTbull

@oPromessa already contributed before to ideas, test

allcontributors[bot] commented 2 years ago

@RhetTbull

I've put up a pull request to add @oPromessa! :tada:

RhetTbull commented 2 years ago

So after a lot of hacking, I think I have multiprocessing sort of working. There were way too many things in the PhotosDB() object that couldn't be pickled (a requirement for multiprocessing) and I didn't want to refactor it completely so I made a compromise. I settled for a two pass approach:

If in single process mode (normal), export photos as normal
If in multi-process mode, run the first part of the export code to collect all photos that need to be exported
Create a list of all UUIDs that need to be exported and break it into chunks based on desired processes
Spawn separate processes and give each a chunk of UUIDs to process
Each new export process has to reload the Photos database (the second pass) and recreate the PhotoInfo objects but then can export in parallel

On my 2013 MacBook Pro (2.4 GHz Dual-Core Intel Core i5), this does appear to provide a modest improvement in export speed:

Single Process

time python -m osxphotos export ~/Desktop/export --uuid-from-file uuid.txt --exiftool --convert-to-jpeg --report single_proc.csv
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Exporting 595 photos to /Users/rhet/Desktop/export...
  [####################################]  595/595
Processed: 595 photos, exported: 601, missing: 32, error: 0
Elapsed time: 0:03:12
python -m osxphotos export ~/Desktop/export --uuid-from-file uuid.txt      96.87s user 30.02s system 58% cpu 3:38.15 total

Two processes

time python -m osxphotos export ~/Desktop/export --uuid-from-file uuid.txt --exiftool --convert-to-jpeg --report multi_proc.csv --multiprocess 2
Using last opened Photos library: /Users/rhet/Pictures/Photos Library.photoslibrary
Exporting 595 photos to /Users/rhet/Desktop/export...
Starting process number #0
Starting process number #1
Process 0 is done
Process 1 is done
All processes finished
Processed: 595 photos, exported: 583, missing: 1, error: 0
Elapsed time: 0:02:28
python -m osxphotos export ~/Desktop/export --uuid-from-file uuid.txt      2  126.20s user 39.10s system 97% cpu 2:49.35 total

You'll notice the total exported is different between the two runs so I need to figure out the difference by examining the export reports but I think this I'm getting close to getting this working correctly.

Update: I fixed the discrepancy between number of photos.

RhetTbull commented 2 years ago

Success!

Screen-Recording-2022-01-30-at-9 59 14-PM

oPromessa commented 2 years ago

Let me know how I can help in the testing of this. As I believe the branch you're using is not accessible to me!

RhetTbull commented 2 years ago

I pushed the branch to Github: https://github.com/RhetTbull/osxphotos/tree/multiprocessing

Happy for any feedback or code contributions if you're feeling inclined. It's a rough alpha at the moment--got it working but need to clean up the code quite a bit (most of the changes are in cli.py:

https://github.com/RhetTbull/osxphotos/blob/79dcfb38a8c067c63a36fd9462ba4b58c2385674/osxphotos/cli.py#L1966

The biggest issue I'm aware of at the moment is that there's a race condition for filename if multiple processes are trying to export a file with the same name to the same location at once. I have a solution in mind but not yet implemented: "reserve" the file with an empty filename preceded by a .. e.g. if file being exported is IMG_1234.jpg, export code would create an empty file named .IMG_1234.jpg to reserve the filename then remove it upon successful export.

Because each of the processes need to load the entire Photos database this is likely to only improve performance for exports where there are a large number of files. It's likely that performance with --convert-to-jpeg and/or --exiftool may improve as these are CPU-bound and not I/O-bound.

oPromessa commented 2 years ago

Results with multiprocessing under rows "0.45.3 mp=x".

there is definitely an overhead of processing the database on each process. As you mentioned it's probably only worthwhile for large number of files being exported.
on the run with mp=8 it was rather strange as the progress bars for all processes were complete but it took quite sometime wrapping up to present All processes finished.

will keep testing

osxphotos version	Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
0.44.13	1st	88	37.158	183.095	99.971	0.422	2.080	1.136
0.44.13	2nd	88	17.630	19.160	19.797	0.200	0.218	0.224
0.45.0	1st	88	-	94.290	53.304	-	1.071	0.605
0.45.0	2nd	88	0.576	6.922	3.774	0.006	0.078	0.042
0.45.3	1st	88	-	-	52	-	-	0.590
0.45.3 mp=2	1st	88	-	-	52	-	-	0.590
0.45.3 mp=4	1st	88	-	-	53	-	-	0.602
0.45.3 mp=8	1st	88	-	-	104	-	-	1.181
0.45.3 mp=4	1st	203	65	-	91	0.320	-	0.448

RhetTbull commented 2 years ago

Thanks for the data. I'm not surprised that 0.45.x is faster as it's avoiding unnecessary round trips to the NAS so this is in family with what I'd expect. The multiprocessing data is interesting as it indicates that either the overhead of loading the library is dominant or the export is I/O bound. It's probably worth instrumenting the multiprocessing code such that it reports time metrics on how long it took to load the library and how long it spent in export. Unless multiprocessing shows a huge improvement for a large export, I'm not sure it's worth spending too much time on the multiprocessing feature.

on the run with mp=8 it was rather strange as the progress bars for all processes were complete but it took quite sometime wrapping up to present All processes finished

Interesting -- I'll take a look and see if I can figure out what's going on. This is the very first python multiprocessing code I've written so quite possible I've done something wrong or inefficient. The progress bar (uses rich.progress) should exit when all tasks have reported done or when no other processes are active (there are two nested loops):

https://github.com/RhetTbull/osxphotos/blob/79dcfb38a8c067c63a36fd9462ba4b58c2385674/osxphotos/cli.py#L2040-L2043

oPromessa commented 2 years ago

Ran a test with a larger set of pics: 1304 Multiprocessing now shows its potential.

But you were right that some conflict could exist on duplicate filenames. This test dataset is also tricky as it is the one with multiple pics with the same filename on the same album. Masked some names with (...)

2: ('Writing metadata with exiftool for AAD019 (5).jpg',)
(...)
Error exporting photo (51E8424E-627C-49DD-ADA4-08FFAAC8DD82: AAD019.jpg) as AAD019.jpg: Error copying file /var/folders/28/x8vbdy4j4vn680b5pt33hs0w0000gn/T/osxphotos_photo_exporter_51E8424E-627C-49DD-ADA4-08FFAAC8DD82_2kcq2cxg/AAD019.jpg to /Volumes/photo/Test.Album/(...)/AAD019 (4).jpg: Error Domain=NSCocoaErrorDomain Code=516 "“AAD019.jpg” couldn’t be copied to “(...)” because an item with the same name already exists." UserInfo={NSSourceFilePathErrorKey=/var/folders/28/x8vbdy4j4vn680b5pt33hs0w0000gn/T/osxphotos_photo_exporter_51E8424E-627C-49DD-ADA4-08FFAAC8DD82_2kcq2cxg/AAD019.jpg, NSUserStringVariant=(
    Copy
), NSDestinationFilePath=/Volumes/photo/Test.Album/(...)/AAD019 (4).jpg, NSFilePath=/var/folders/28/x8vbdy4j4vn680b5pt33hs0w0000gn/T/osxphotos_photo_exporter_51E8424E-627C-49DD-ADA4-08FFAAC8DD82_2kcq2cxg/AAD019.jpg, NSUnderlyingError=0x600001e3c510 {Error Domain=NSPOSIXErrorDomain Code=17 "File exists"}} (photoexporter.py: 1127)
Retrying export for photo (51E8424E-627C-49DD-ADA4-08FFAAC8DD82: AAD019.jpg)

osxphotos version	Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
0.45.3	1st	1304	-	-	672	-	-	0.515
0.45.3 mp=4	1st	1304	-	-	290	-	-	0.222

RhetTbull commented 2 years ago

But you were right that some conflict could exist on duplicate filenames.

I've made some changes to the multiprocessing branch to use lock files to effectively "reserve" a filename which should minimize if not eliminate file name conflicts. When you get a chance, I'd appreciate you re-running the larger export set to see if you still get name conflicts.

oPromessa commented 2 years ago

Some quick feedback:

No name collision! YAY!
a bit slower with the locking mechanism of course but still faster then single process.
Some WARNING messages on database locked.
Oddly enough process 1: took longer than others.
- Was it handlilng larger files ? (must go through the logs to double check).
- like, processes 0, 2 and 3 reached the total 326 (1034/4) while process 1 was still at 78!

osxphotos version	Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
0.45.3	1st	1304	-	-	672	-	-	0.515
0.45.3 mp=4	1st	1304	-	-	290	-	-	0.222
0.45.8 mp=4	1st	1304	-	-	413	-	-	0.316

RhetTbull commented 2 years ago

Ugh...that's a lot slower(~42%) with the lock files. The lock file code is ugly and could result in stray lock files if the app crashes, etc. I may rip all that out--toying with another solution: start a local process that listens on a socket (e.g. just a mini web server with an API) and communicate over the socket to "claim" a filename. This would likely be much faster than creating and checking for existence of lock files over the NAS and no possibility of residual lock files. This adds some complexity as I wouldn't want to mess with the server if in single process mode (where there's no risk of collision).

oPromessa commented 2 years ago

Maybe you're more comfortable with a mini web server +API, but wouldn't a control process and then use Queues and locks from multiprocessing do the trick?

Update: I now got your point to reserve/lock the filename on the filesystem. Other than that yes, it would need a placeholder to reserve the filenames in memory but then ensure that upon writing them there is no conflict. Hmmm. considerable task! ;)

RhetTbull commented 2 years ago

Maybe you're more comfortable with a mini web server +API, but wouldn't a control process and then use Queues and locks from multiprocessing do the trick?

Yes, I'm just not familiar with multiprocess programming and find a server/client easy to understand. I'll read some more about the multiprocessing package -- maybe a lock that could be acquired by the file naming method could work.

However, one consideration is I want to minimize the places in the code that have to be "aware" of multiprocessing to not further complicate the code. Currently, only a couple of methods in cli.py need to deal with multiprocessing. The actual code that does the export (in PhotoExporter class) doesn't need to know whether or not it's running in a single process or several. Adding things like locks or interprocess queues means I have to pass that state around and then python starts to get really whiny about which things can be pickled and which can't.

One possible solution is the dependency injection method I use for --dry-run. The PhotoExporter class gets an instance of an ExportDB class which handles the export database and a FileUtil class which handles things like copy(), move(), unlink(), etc. If in dry run mode, PhotoExporter gets no-op versions of those classes so all the export logic runs just as it would in the actual export but nothing actually gets exported. I suppose the FileUtil class could be modified to add a FileUtilMultiProcess version which took care of the file renaming by acquiring a lock. I'll have to think about this and read up on using locks.

Other than that yes, it would need a placeholder to reserve the filenames in memory

I think this would suffice if all file names were obtained through a common method that kept track of which names had been given out. If this was done with a lock, it could be guaranteed that only a single instance of a name was ever handed out.

oPromessa commented 2 years ago

Re-ran some tests for NAS copy with version `0.47.1`

with the .osxphotos_export.db on the remote NAS server: we know it's slow! A bit slower than 0.45.3
with the --ramdb option we have a whopping performance

osxphotos version	Run	Pics	SSD	NAS (100)	NAS (1G)	Sec/Pic SSD	Sec/Pic NAS (100)	Sec/Pic NAS (1G)
0.45.3	1st	1304	-	-	672	-	-	0.515
0.45.8 mp=4 (locking)	1st	1304	-	-	413	-	-	0.316
0.47.1	1st	1304	-	-	700	-	-	0.536
0.47.1 --ramdb	1st	1304	-	-	390	-	-	0.299

Will see if I can merge `multiprocessing` with `0.47.1 master` branches to run some more tests!

`0.47.1` profile results

Profiling completed
         89654371 function calls (78067164 primitive calls) in 771.205 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   227172  211.915    0.001  211.915    0.001 {method 'readline' of '_io.BufferedReader' objects}
     5219  195.555    0.037  195.555    0.037 {method 'commit' of 'sqlite3.Connection' objects}
    13504  123.162    0.009  123.162    0.009 {method 'execute' of 'sqlite3.Cursor' objects}
     2611  104.748    0.040  105.405    0.040 fileutil.py:93(copy) 
     3147   35.444    0.011   35.444    0.011 {built-in method posix.listdir}
        1   14.478   14.478   31.781   31.781 photosdb.py:1576(_process_database5)
    22867   10.413    0.000   10.447    0.000 {built-in method posix.stat}
4614293/92738    8.221    0.000   26.330    0.000 __init__.py:234(parse)
        1    7.820    7.820   12.533   12.533 _photosdb_process_searchinfo.py:37(_process_searchinfo)
     2608    6.324    0.002    6.324    0.002 {built-in method posix.utime}
2674454/92738    3.647    0.000   25.533    0.000 __init__.py:353(_parse)
851611/92737    2.817    0.000    6.027    0.000 model.py:477(process_node)
     4348    2.662    0.001    2.662    0.001 {method 'fetchone' of 'sqlite3.Cursor' objects}
     3910    2.142    0.001  214.343    0.055 exiftool.py:305(run_commands)
  4386876    2.062    0.000    3.342    0.000 utils.py:418(normalize_unicode)
2440290/1500458    1.960    0.000    2.734    0.000 __init__.py:108(flatten)
1760147/1759614    1.771    0.000    8.185    0.000 __init__.py:759(parse)

`0.47.1 --ramdb` profile results

Profiling completed
         89609322 function calls (77959587 primitive calls) in 467.076 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   220397  251.014    0.001  251.014    0.001 {method 'readline' of '_io.BufferedReader' objects}
     2527   74.567    0.030   75.771    0.030 fileutil.py:93(copy)
     2781   25.982    0.009   25.982    0.009 {built-in method posix.listdir}
        1   14.858   14.858   32.100   32.100 photosdb.py:1576(_process_database5)
     1766   14.523    0.008   14.523    0.008 {built-in method posix.unlink}
    23482   10.171    0.000   10.204    0.000 {built-in method posix.stat}
4639493/93242    8.086    0.000   26.028    0.000 __init__.py:234(parse)
        1    7.781    7.781   12.435   12.435 _photosdb_process_searchinfo.py:37(_process_searchinfo)
     2524    4.165    0.002    4.165    0.002 {built-in method posix.utime}
        2    4.134    2.067    4.134    2.067 {method 'executescript' of 'sqlite3.Cursor' objects}
2689070/93242    3.563    0.000   25.232    0.000 __init__.py:353(_parse)
856147/93241    2.822    0.000    5.956    0.000 model.py:477(process_node)
     3742    2.097    0.001  253.388    0.068 exiftool.py:305(run_commands)
  4386876    2.068    0.000    3.332    0.000 utils.py:418(normalize_unicode)
2453394/1508522    1.938    0.000    2.684    0.000 __init__.py:108(flatten)
1769723/1769190    1.790    0.000    8.178    0.000 __init__.py:759(parse)

RhetTbull commented 2 years ago

Great! The export code refactor in 0.46 requires more read/writes to the database so not surprised it's a bit slower but it will be much more robust.

Will see if I can merge multiprocessing with 0.47.1 master branches to run some more tests!

That may be difficult as I've refactored bug sections of the affected code. It might be easier to add the multiprocessor code back in manually.

It might be worth adding a checkpoint option that writes the export data base from RAM periodically.

RhetTbull commented 2 years ago

By the way, I don't think --ramdb will be compatible with multiprocessing as each process would get a copy of the database.

RhetTbull commented 1 year ago

Though I have way too many projects at the moment, I had an idea last night and I think I have a new design that could resurrect multiprocessing without too much rework and without the cost of the previous design which had to re-read the database inside each process.

The primary limitation for multiprocessing at the moment is the fact that the PhotosDB and PhotoInfo objects can't be serialized and sent between processes. It would be possible to create a new PhotoInfo subclass that instead of reading the database, reads a JSON serialization of a PhotoInfo object. The ability to export to JSON is already inbuilt in the PhotoInfo.json(), need to add the ability to instantiate a PhotoInfo object from the JSON then the JSON could be passed between processes (or just written to a temp SQLite database that all processes could read) and all the rest of the code would work as expected.

RhetTbull / osxphotos