RhetTbull / osxphotos

Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.
MIT License

Add specific UTI/suffix support to --convert-to-jpeg #215

Open neilpa opened 4 years ago

neilpa commented 4 years ago

Would it be feasible to add an option to always convert RAW images on export to JPEG? Any edited photos are already going to be JPEG but I've got a lot of un-edited raws w/out the JPEG preview. This would allow something like the following (totally made up --raw-to-jpeg) to quickly create a backup that is also extremely portable.

osxphotos export --raw-to-jpeg --export-by-date \
  --skip-original-if-edited --edit-suffix "" \
  --no-extended-attributes /Volumes/thumbdrive

RhetTbull commented 4 years ago

@neilpa I've thought about this for .heic (issue #73). For .heic, /usr/bin/sips can do the conversion. I'm not aware of any universal way to do this for the various RAW formats that wouldn't require an external app; however, Photos itself can convert the images. That is in fact what happens when you export an image in Photos without selecting "Export unmodified original".

The code that interacts with Photos for the --download-missing option could also be used to force export of the RAW as a jpeg (or I could use my new PhotoScript library). There are a few downsides to this approach though:

  1. Interacting with Photos is slow -- on my system it takes about 1 sec/photo just to get the photo from Photos and initiate the export not counting time to convert from RAW to jpeg.
  2. Interacting with Photos can be very flaky--it was better on Mojave but under Catalina and Big Sur beta, Photos is prone to crashing when being scripted. The aforementioned PhotoScript package seems to be more reliable because I've built-in retries for cases where I've seen Photos be flaky.
  3. Photos only has two export options: 1) export current version of image as high quality jpeg or 2) export original in original format. This means that you'd always get the edited version of the image if there had been any edits or the original version if there were no edits. You couldn't get a jpeg of the "original" RAW if you'd made edits even though the RAW file technically hadn't changed.

Another possibility would be to add an option to allow the user to run a command of choice (ImageMagick for example) on the exported file after export and provide a way to pass the path of the exported photo as a parameter to the user's command. This could be combined with the osxphotos templating system. For example:

--command "/usr/bin/sips -s format jpeg {exported_file} --out {original_basename}.JPG"

Though this latter approach might be better as a separate issue altogether. Thoughts?

Edit: Thinking some more about this: a con of the --command approach is the resulting converted photo would not be tracked by osxphotos so it would not be considered when running export --updated. Another con is the user would need to install a command line tool like ImageMagick to do the conversion. On the other hand, a benefit would be these commands could possibly be run in a separate thread pool so they could run in parallel with the export and not slow down the actual export from Photos.
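
As a rough illustration of the --command idea above, here is a minimal Python sketch that fills a command template for each exported file and runs the commands in a thread pool. The function names (render_command, run_post_export) are hypothetical, and so is the meaning given to {original_basename} (the exported path without its extension); none of this is an existing osxphotos API.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def render_command(template, exported_file):
    """Fill the template fields and split the result into an argv list."""
    path = Path(exported_file)
    fields = {
        # Quote each value so paths containing spaces survive shlex.split
        "exported_file": shlex.quote(str(path)),
        # Assumption: "original_basename" means the path without its extension
        "original_basename": shlex.quote(str(path.with_suffix(""))),
    }
    return shlex.split(template.format(**fields))


def run_post_export(template, exported_files, max_workers=4):
    """Run the rendered command for each file in a thread pool.

    Returns a dict mapping each exported file to the command's return code.
    """
    files = list(exported_files)

    def run_one(exported_file):
        argv = render_command(template, exported_file)
        return subprocess.run(argv, capture_output=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(files, pool.map(run_one, files)))
```

Because the commands run outside the export bookkeeping, the converted files would still be untracked, which is the first con mentioned above.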

neilpa commented 4 years ago

This is the answer I was afraid of.

For .heic, /usr/bin/sips can do the conversion

Interesting, I wasn't aware of this built-in command and had been using the libheif tools

Photos itself can convert the images.

Yea, I had a small hope that this would be available via PhotoKit without having to script the GUI. I can't imagine going that route once you start getting above 1k photos. Exporting my full library of ~16k photos to a fast external SSD currently takes ~20 minutes. Seems unlikely that would go well when invoking Photos for each image given the occasional crashes I already see.

Another possibility would be to add an option to allow the user to run a command of choice

I suspect this would be hard to make work in practice. You'll likely need different commands for heif (e.g. sips) vs raw (e.g. dcraw) files. You could wrap it in a shell script, but then you lose the benefits of the templating system. Maybe there's an ergonomic way to do this on a per-extension (or UTI?) basis. However, I worry the plethora of raw file formats would make this impractical.

the resulting converted photo would not be tracked by osxphotos so it would not be considered when running export --updated

I haven't looked into the mechanics of how --updated works yet. My current thinking (sans builtin conversion) is to do a post-hoc conversion after export, then truncate the raw files to 0 bytes keeping the exported file created/mod times. Would that "trick" osxphotos into not re-exporting images unless they had actually been updated in the Photos app?

RhetTbull commented 4 years ago

I haven't looked into the mechanics of how --updated works yet. My current thinking (sans builtin conversion) is to do a post-hoc conversion after export, then truncate the raw files to 0 bytes keeping the exported file created/mod times. Would that "trick" osxphotos into not re-exporting images unless they had actually been updated in the Photos app?

Unfortunately that would cause osxphotos to re-export the image. For --update, osxphotos looks at the file signature (mtime, size, mode) as well as information stored in .osxphotos_export.db so the zero length RAW would look like a change and thus another export would be triggered.
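
To make the signature check concrete, here is a minimal sketch in Python. It only models the (mode, size, mtime) comparison described above; the function names are hypothetical, and the real check also consults .osxphotos_export.db, which is not modeled here.

```python
import os
import tempfile


def file_signature(path):
    """Return (mode, size, mtime) for a file."""
    st = os.stat(path)
    return (st.st_mode, st.st_size, int(st.st_mtime))


def needs_export(path, recorded_signature):
    """True if the file is missing or no longer matches the recorded signature."""
    if not os.path.exists(path):
        return True
    return file_signature(path) != recorded_signature


# Demonstration: truncate a file to 0 bytes but restore its timestamps.
with tempfile.NamedTemporaryFile(suffix=".CR2", delete=False) as f:
    f.write(b"raw image bytes")
    raw_path = f.name

recorded = file_signature(raw_path)
st = os.stat(raw_path)
open(raw_path, "w").close()  # truncate to 0 bytes
os.utime(raw_path, (st.st_atime, st.st_mtime))  # restore the original times
changed = needs_export(raw_path, recorded)
os.remove(raw_path)
print(changed)  # True: the size no longer matches
```

Even with the timestamps preserved, the size component of the signature differs, so the truncated file would be treated as changed.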

I would really like to do this via PhotoKit. I'm not competent in Swift or Objective-C but I've played around with using PhotoKit via PyObjC and System Integrity Protection (SIP) gets in the way of doing it reliably. I've gotten it to work on my system but can't reliably make it work repeatedly on other systems. I think it has to do with how the terminal is asking for permission to use Photos -- if this fails the first time, there's an entry in tcc.db and now Terminal never asks again causing all PhotoKit requests to fail. I gave up after many hours of frustration and defaulted to using AppleScript when osxphotos wasn't enough.

If I could crack the code on PhotoKit and SIP, this sort of thing would be much easier. One idea I've toyed with is creating a stand-alone helper utility in Swift that would manage asking permission for SIP and then interact with osxphotos via a pipe (much like I do now with exiftool) to export photos, etc. Something like this would also allow changes to the actual library from Python (editing photos, adding photos, creating albums, etc). Given that I get a few hours a week at most to hack on this project, that's a ways off as I'd have to learn enough Swift to get started.

Here's an example that exports a photo based on UUID. (It exports in native format but could probably get PhotoKit to convert to JPEG).

RhetTbull commented 4 years ago

Ooh...someone's created a python wrapper for dcraw: pyunraw -- that could be called from osxphotos to do the conversion. One wrinkle is it's GPL3 licensed (as is dcraw). I prefer a more permissive license and have kept everything in osxphotos to MIT licensed code or compatible. I'm not a license expert so don't know if importing pyunraw makes all of osxphotos fall under GPL3 but my gut says yes. From a philosophical position I'm not sure I want to go down this path.

Edit: Here's an MIT-licensed python RAW to jpeg converter: rawpy

RhetTbull commented 4 years ago

Another possibility is to use a plug-in system. For example:

pip install osxphotos-plugin-raw-to-jpeg

osxphotos export /path/to/export --plugin raw-to-jpeg

neilpa commented 4 years ago

I hadn't actually used dcraw before and was doing some experiments this morning. I'm not sure it's going to work as well as I had hoped, at least for bulk exports. The default color correction leaves a lot to be desired. I also experimented with various options and got slightly better results. However, it doesn't compare to the default results I get from the Photos app (or something like Darktable). I'm very much a noob at "developing" digital photos so I may be overlooking something obvious.

There's also darktable-cli which I want to test. Maybe that will work better out of the box.

I'm not competent in Swift or Objective-C

It's been awhile, but I've used both a fair bit. I may experiment this afternoon to see what's possible in terms of export.

RhetTbull commented 4 years ago

@neilpa if it's possible to do this without PhotoKit -- e.g. with CoreImage by passing the path to the RAW file, then we wouldn't need to worry about SIP. The advantage of PhotoKit is you can get the image directly by UUID and even if it's in iCloud and not downloaded, PhotoKit will fetch it. On the flip side, I think PhotoKit will only work with the "system library" and many people have more than one Photos library.

neilpa commented 4 years ago

Good call on CoreImage, that makes the conversion nearly trivial. I did a quick test with that approach and it worked great for the raw images I tried.

As for the PhotoKit limitation, it may be possible to use a custom library via the private header initialization methods. I'm one of those folks with a non-standard library and will see if I can get something working here.

RhetTbull commented 4 years ago

@neilpa that StackOverflow link was exactly what I needed....took a bit of fiddling, but I've created a pure Python implementation that uses CoreImage, via PyObjC, to convert RAW to jpeg.

Still needs testing, error handling, etc. but this should do the trick for both heic and RAW...and way better than scripting Photos. Of course, if you're able to develop a PhotoKit interface that would be very useful as well.

See this gist:

# reference: https://stackoverflow.com/questions/59330149/coreimage-ciimage-write-jpg-is-shifting-colors-macos/59334308#59334308
import pathlib

# needed to capture system-level stderr
from wurlitzer import pipes

import Metal
import Quartz
from Cocoa import NSURL
from Foundation import NSDictionary

def export_image_to_jpeg(input_path, output_path, compression_quality=1.0):
    """ export image to jpeg

    Args:
        input_path: path to input image (e.g. '/path/to/import/file.CR2')
        output_path: path to exported jpeg (e.g. '/path/to/export/file.jpeg')
        compression_quality: JPEG compression quality, float in range 0.0 to 1.0; default is 1.0 (best quality)

    Return:
        path to exported JPEG or None if export failed

    Raises:
        ValueError if compression quality not in range 0.0 to 1.0
        FileNotFoundError if input_path doesn't exist
    """
    if not pathlib.Path(input_path).is_file():
        raise FileNotFoundError(f"could not find {input_path}")

    if not (0.0 <= compression_quality <= 1.0):
        raise ValueError(f"illegal value for compression_quality: {compression_quality}")

    input_url = NSURL.fileURLWithPath_(input_path)
    output_url = NSURL.fileURLWithPath_(output_path)

    with pipes() as (out, err):
        # capture stdout and stderr from system calls 
        # otherwise, Quartz.CIImage.imageWithContentsOfURL_ 
        # prints to stderr something like: 
        # 2020-09-20 20:55:25.538 python[73042:5650492] Creating client/daemon connection: B8FE995E-3F27-47F4-9FA8-559C615FD774
        # 2020-09-20 20:55:25.652 python[73042:5650492] Got the query meta data reply for: com.apple.MobileAsset.RawCamera.Camera, response: 0
        input_image = Quartz.CIImage.imageWithContentsOfURL_(input_url)

    context_options = NSDictionary.dictionaryWithDictionary_(
        {
            "workingColorSpace": Quartz.CoreGraphics.kCGColorSpaceExtendedSRGB,
            "workingFormat": Quartz.kCIFormatRGBAh,
        }
    )
    mtldevice = Metal.MTLCreateSystemDefaultDevice()
    context = Quartz.CIContext.contextWithMTLDevice_options_(
        mtldevice, context_options
    )
    output_colorspace = (
        input_image.colorSpace()
        if input_image.colorSpace()
        else Quartz.CGColorSpaceCreateWithName(
            Quartz.CoreGraphics.kCGColorSpaceSRGB
        )
    )
    output_options = NSDictionary.dictionaryWithDictionary_(
        {"kCGImageDestinationLossyCompressionQuality": compression_quality}
    )
    result, error = context.writeJPEGRepresentationOfImage_toURL_colorSpace_options_error_(
        input_image, output_url, output_colorspace, output_options, None
    )
    # result is a success flag; return the output path as the docstring promises
    return output_path if result and not error else None

if __name__ == "__main__":
    import pathlib
    import sys

    if len(sys.argv) != 2:
        sys.exit(f"Usage: {__file__} input_file")

    input_path = sys.argv[1]
    output_path = (
        f"{pathlib.Path(input_path).parent / pathlib.Path(input_path).stem}.jpeg"
    )
    exported = export_image_to_jpeg(input_path, output_path, 0.75)
    if exported is not None:
        print(f"Exported file {input_path} to {output_path}")
    else:
        print(f"Error exporting {input_path} to {output_path}")

RhetTbull commented 4 years ago

Thinking through interface. Maybe something like:

osxphotos export /path/to/export --convert-to-jpeg --jpeg-quality 0.75

This would convert any non-JPEG file (not just RAW, so heic too), allow user to set jpeg conversion quality, and register the converted images in .osxphotos_export.db so --update worked correctly.

Alternatively, could do --raw-to-jpeg and --heic-to-jpeg to be more explicit. But there are still other formats like PNG, TIFF, etc. people might want to convert.

Also, some might want to export both the RAW and the converted jpeg and some might want to export only the converted jpeg. Need a way to specify this.

RhetTbull commented 4 years ago

Maybe --skip-original-if-converted as there's already a --skip-original-if-edited

RhetTbull commented 4 years ago

Apparently CIContext is expensive to init -- maybe use a singleton class like I do for ExifTool. Need to do some benchmarking.
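
One way to reuse an expensive object is to cache its creation so every conversion shares the same instance. The sketch below uses a stand-in class rather than a real CIContext, so it runs anywhere; ConversionContext, get_context, and convert_many are hypothetical names for illustration only.

```python
import functools


class ConversionContext:
    """Stand-in for an expensive-to-create object such as a CIContext."""

    instances_created = 0

    def __init__(self):
        ConversionContext.instances_created += 1


@functools.lru_cache(maxsize=None)
def get_context():
    """Create the context on first use and return the same object afterwards."""
    return ConversionContext()


def convert_many(paths):
    """Pretend to convert each path, reusing one shared context."""
    results = []
    for path in paths:
        context = get_context()  # no new context after the first call
        results.append((path, id(context)))
    return results
```

A cached factory function gives the same effect as a singleton class with less code; whether it actually helps would still need the benchmarking mentioned above.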

neilpa commented 4 years ago

Yea, you definitely want to reuse the context if possible. (There was a similar comment in the SO post I linked).

I think PhotoKit will only work with the "system library" and many people have more than one Photos library.

After spending some time with the private headers I was able to hack together some Objective-C code that successfully loads a photos library from an arbitrary path. I got far enough to fetch a PHAsset via UUID and validate a few properties, including that the path on disk is as expected.

I need to clean this up a bit but I'll push a proof-of-concept Xcode project and share more of my findings. There's some interesting stuff in the header dump that I want to look at more. In particular, I think I've found a way to extract the adjustment data (at least the crop region along with orientation and rotation).

neilpa commented 4 years ago

Here's my initial PoC Xcode project - https://github.com/neilpa/photohack

If you add a new Xcode scheme with the first argument as a path to a *.photoslibrary and subsequent arguments as asset UUIDs it'll dump adjustment data in a readable format. That should get you something like the following.

{
    Album = PHAssetCollection;
    Asset = PHAsset;
    CloudSharedAlbum = PHCloudSharedAlbum;
    DetectedFace = PHFace;
    DetectedFaceGroup = PHFaceGroup;
    FaceCrop = PHFaceCrop;
    FetchingAlbum = PHAssetCollection;
    Folder = PHCollectionList;
    GenericAsset = PHAsset;
    ImportSession = PHImportSession;
    Keyword = PHKeyword;
    LegacyFaceAlbum = PHAssetCollection;
    Memory = PHMemory;
    Moment = PHMoment;
    MomentList = PHMomentList;
    MomentShare = PHMomentShare;
    MomentShareParticipant = PHMomentShareParticipant;
    Person = PHPerson;
    PhotoStreamAlbum = PHAssetCollection;
    PhotosHighlight = PHPhotosHighlight;
    ProjectAlbum = PHProject;
    Question = PHQuestion;
    Suggestion = PHSuggestion;
}
adjustments: E1C9A934-260F-4211-95AD-14827A82ACB4
  <none>
2020-09-22 20:17:59.046444-0700 photohack[13032:3222620] Metal API Validation Enabled
adjustments: 34B8483F-8A35-4FEB-98A3-84DC376F0C9F
  composition = <NUGenericComposition:0x100457b60 id=me.neilpa.photohack:PhotosComposition~1.0 mediaType=Unknown contents={
    raw = <NUGenericAdjustment:0x100458080> id=me.neilpa.photohack:RAW~1.0 settings={
    auto = 0;
    enabled = 1;
    inputDecoderVersion = 8;
},
    cropStraighten = <NUGenericAdjustment:0x1004582e0> id=me.neilpa.photohack:CropStraighten~1.0 settings={
    angle = "-0";
    auto = 0;
    constraintHeight = 0;
    constraintWidth = 0;
    enabled = 1;
    height = 988;
    width = 1482;
    xOrigin = 2342;
    yOrigin = 1127;
},
    orientation = <NUGenericAdjustment:0x100458610> id=me.neilpa.photohack:Orientation~1.0 settings={
    value = 1;
},
}>

I'll try to get a proper binary built (and updated instructions) but haven't figured out how to properly embed the Info.plist such that it can request photos library permissions.

RhetTbull commented 4 years ago

Got photohack working -- yay! I'll try to learn enough Swift to usefully contribute. Now that you've proven this can be done, I've got lots of ideas for extending this. For example, a simple interface that listened on stdin for commands and spit output to stdout (perhaps as JSON) would be really useful. You could open a pipe to the CLI with name of photos library as argument then emit commands, for example:

getAssetAdjustments UUID
>>> { JSON adjustment data }
getAssetMetaData UUID
>>> { JSON metadata data }
exportAsset UUID path
>>> { exportPath: path }
deleteAsset UUID
>>> { deleted: True }

Keeping a pipe open would make all this relatively fast...much faster than AppleScript which I use to export images that aren't downloaded from iCloud for example.
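
The line-based protocol above could look something like the following sketch, written in Python as a stand-in for the helper. The handlers are stubs that return canned data where a real helper would call PhotoKit, and handle_line and serve are hypothetical names. Splitting on whitespace is a simplification that would break on paths containing spaces.

```python
import io
import json


def handle_line(line, handlers):
    """Parse one 'command arg ...' line and return a one-line JSON reply."""
    parts = line.split()
    if not parts:
        return json.dumps({"error": "empty command"})
    name, args = parts[0], parts[1:]
    handler = handlers.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown command: {name}"})
    try:
        return json.dumps(handler(*args))
    except TypeError:
        # Raised when the number of arguments does not match the handler
        return json.dumps({"error": f"wrong arguments for {name}"})


def serve(infile, outfile, handlers):
    """Read commands line by line and write one JSON reply per line."""
    for line in infile:
        outfile.write(handle_line(line, handlers) + "\n")
        outfile.flush()


# Stub handlers; a real helper would call into PhotoKit here.
HANDLERS = {
    "exportAsset": lambda uuid, path: {"exportPath": path},
    "deleteAsset": lambda uuid: {"deleted": True},
}

# In-memory streams stand in for the helper's stdin and stdout.
replies = io.StringIO()
commands = io.StringIO("exportAsset ABC /tmp/out.jpeg\ndeleteAsset ABC\nbogus\n")
serve(commands, replies, HANDLERS)
print(replies.getvalue(), end="")
```

One reply per line keeps the client side simple: write a command, read exactly one line back, and parse it as JSON.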

Or, the interface could be a local webserver with RESTful API though that would take more work than a simple CLI.

neilpa commented 4 years ago

I did more digging into the adjustment data and seem to have figured out the schema for the full scope of edits that are possible in the Photos app. There's a lot of details in https://github.com/neilpa/photohack/pull/1. I haven't done much testing yet on sample images but would be interesting to run this through the test libraries in this repo.

I also figured out how to make a standalone binary work so I attached that to a release. That should make it a bit easier to experiment with. For now I just have it dump a JSON map with the asset UUID as keys and the adjustment dictionaries as values.

Or, the interface could be a local webserver with RESTful API though that would take more work than a simple CLI.

That would be an interesting approach but I'm not sure how simple it is to do something like this in Swift/Objective-C out of the box. I usually default to Go for CLI apps for this reason: it's trivial to add a web/http interface if needed. In this case I didn't want to add yet another language into the mix.

RhetTbull commented 4 years ago

Great! I'll do some experimenting this weekend with the test libraries and maybe eventually take a stab at writing some tests if I can figure out how to do so in Xcode.

RhetTbull commented 4 years ago

I'm close to having the initial implementation done for this. Need to work on the --update code to keep files from being converted to jpeg when nothing's changed. The first version I push out will only have a --convert-to-jpeg option that converts all non-jpeg images on export. I'll then add a --convert-raw-to-jpeg that only converts raw images. Will also need to add an option to set custom jpeg compression level. Current implementation uses full-quality jpeg which is what Photos does.

RhetTbull commented 4 years ago

Status update: This is in work (see convert_to_jpeg branch) but has turned out to be much more difficult than I anticipated.

neilpa commented 4 years ago

the code thus fails in GitHub Actions which use virtual machines w/o GPUs.

It should be possible to detect this (e.g. the MTLDevice creation will fail) and fallback to a CPU-based context.

RhetTbull commented 4 years ago

It should be possible to detect this (e.g. the MTLDevice creation will fail) and fallback to a CPU-based context.

Thanks. I'd looked into that but the documentation for CGBitmapContextCreate scared me off and I wasn't sure it was worth the effort for what's likely an edge case. If I'm wrong and this affects a lot of people, I'll take a stab at figuring out how to create a bitmap context. I configured tests to run locally but skip on GitHub Actions.
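
The fallback neilpa describes boils down to "try to get a device, and use a CPU context if there isn't one". The sketch below shows only that control flow, with the platform calls passed in as functions so it runs without PyObjC; create_context and its parameters are hypothetical names. On macOS, create_device would presumably be Metal.MTLCreateSystemDefaultDevice (which returns None when no Metal device exists) and gpu_context would wrap Quartz.CIContext.contextWithMTLDevice_options_ as in the gist. Which call builds a working CPU-based context is an untested assumption and is left to the caller.

```python
def create_context(create_device, gpu_context, cpu_context):
    """Return a GPU-backed context if a device exists, else a CPU-backed one.

    create_device: callable returning a device object, or None if unavailable
    gpu_context:   callable taking the device and returning a context
    cpu_context:   callable taking no arguments and returning a context
    """
    device = create_device()
    if device is None:
        # No GPU (e.g. a CI virtual machine): fall back to software rendering
        return cpu_context()
    return gpu_context(device)


# Example with stand-in callables instead of real Metal/Quartz calls.
no_gpu = create_context(lambda: None, lambda d: ("gpu", d), lambda: "cpu")
with_gpu = create_context(lambda: "device0", lambda d: ("gpu", d), lambda: "cpu")
print(no_gpu, with_gpu)
```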

I'm trying to get this feature MVP'd as I've got a couple other projects begging for my attention.

RhetTbull commented 4 years ago

v0.35.0 adds --convert-to-jpeg and --jpeg-quality to osxphotos export. These convert all non-jpeg images, including raw, to jpeg. Will look at also adding a --raw-to-jpeg for just raw images later.

RhetTbull commented 11 months ago

--convert-to-jpeg works for this use case but might be good to be able to specify UTI or file extension to convert (RAW, HEIC, etc.) instead of converting anything that's not a jpeg
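
A suffix-based filter for this could be as simple as the sketch below. The function names and the idea of passing a list of suffixes are hypothetical, not an existing osxphotos option, and matching by UTI would need a macOS API that is not shown here.

```python
from pathlib import Path


def normalize_suffixes(suffixes):
    """Lower-case suffixes and make sure each starts with a dot."""
    return {"." + s.lower().lstrip(".") for s in suffixes}


def should_convert(path, convert_suffixes):
    """True if the file's extension is in the user's convert list."""
    return Path(path).suffix.lower() in normalize_suffixes(convert_suffixes)


# Convert only RAW (.cr2) and HEIC files, leaving PNG and TIFF untouched.
wanted = ["cr2", ".HEIC"]
for name in ["IMG_1.CR2", "IMG_2.heic", "IMG_3.png"]:
    print(name, should_convert(name, wanted))
```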