RhetTbull / osxphotos

Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.
MIT License

Add --fix-file-extension to export #382

Open RhetTbull opened 3 years ago

RhetTbull commented 3 years ago

See #336 and #381

Photos and external editing apps can apply an incorrect extension, and Photos doesn't complain. It's also possible to import a photo with an incorrect extension; Photos displays it fine but gets the UTI wrong. It would be good to have an option to fix the extension on export. I think the most reliable way to do this is to use exiftool to get the file type (see comments on #381).

>>> from osxphotos.exiftool import ExifTool
>>> exif = ExifTool("/Users/rhet/Pictures/Test-10.16.0.1.photoslibrary/originals/D/D05A5FE3-15FB-49A1-A15D-AB3DA6F8B068.dng")
>>> exif.asdict()["File:FileTypeExtension"]
'DNG'
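
A minimal sketch of how that value might be compared with the current file name. The helper names (normalize, extension_mismatch) and the alias table are hypothetical and not part of osxphotos; the only input taken from the example above is the File:FileTypeExtension string.

```python
import pathlib
from typing import Optional

# Hypothetical alias table: suffixes treated as the same type (an assumption; extend as needed)
ALIASES = {"jpeg": "jpg", "tiff": "tif"}


def normalize(ext: str) -> str:
    """Lowercase an extension, drop any leading dot, and map aliases to one spelling."""
    ext = ext.lower().lstrip(".")
    return ALIASES.get(ext, ext)


def extension_mismatch(path: str, reported_ext: Optional[str]) -> bool:
    """Return True if the path's suffix disagrees with the reported file type.

    reported_ext is the File:FileTypeExtension value, e.g. 'DNG'.
    An unknown type (None or empty) is never treated as a mismatch.
    """
    if not reported_ext:
        return False
    return normalize(pathlib.Path(path).suffix) != normalize(reported_ext)
```

With this sketch, a file with no suffix at all counts as a mismatch whenever a type is reported.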
RhetTbull commented 3 years ago

Should there be an option to do this only for original images or only for edited images? The problem can happen on import (originals) or, more often, on edit, but it's a lot easier to just apply the fix to all images.

This would work well with a cache for exiftool (#325).
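
As a rough illustration of what such a cache could look like (not the design tracked in #325), the sketch below keys results on path, modification time, and size so that a changed file is looked up again. FileTypeCache and its lookup argument are hypothetical names.

```python
import os
from typing import Callable, Dict, Optional, Tuple

# Cache key: (absolute path, mtime in ns, size); a modified file produces a new key
CacheKey = Tuple[str, int, int]


class FileTypeCache:
    """In-memory cache of file type lookups (hypothetical helper, not an osxphotos API)."""

    def __init__(self, lookup: Callable[[str], Optional[str]]):
        # lookup is the expensive call, for example:
        #   lambda p: ExifTool(p).asdict().get("File:FileTypeExtension")
        self._lookup = lookup
        self._cache: Dict[CacheKey, Optional[str]] = {}
        self.misses = 0  # how many times the expensive lookup actually ran

    def get(self, path: str) -> Optional[str]:
        """Return the cached result for path, running the lookup only on a cache miss."""
        stat = os.stat(path)
        key = (os.path.abspath(path), stat.st_mtime_ns, stat.st_size)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._lookup(path)
        return self._cache[key]
```

This cache lives only for the duration of one run; persisting it between runs would need a database or file, which is left out here.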

RhetTbull commented 3 years ago

For this to work with --download-missing and --use-photos-export, the check will need to occur after the file is exported to the export directory. This could result in the file having a different name, which would then make the --update logic think the file was missing, resulting in the file being re-downloaded. To avoid this, the following may work:

PetrochukM commented 1 year ago

I love this idea! I am running into this issue now, and I am trying to figure out a way to fix the file extensions before running an export, so that everything goes smoothly.

RhetTbull commented 1 year ago

@PetrochukM how many photos do you have with this problem? Are you intending to do a 1-time export or a recurring export with --update? If this is a one-time export or it's a small number of photos, you could use --post-function to call a custom function which would examine the extension and the file type and rename the file if necessary. This would be relatively simple to implement. The downside is that this couldn't be used with --cleanup and if you used --update, the files will get re-exported with each subsequent export as the name won't match what's in the export database. This feature is on my to-do list but it's pretty far down the list so will be a while before I get to it.

If you save the following as fix_export_extension.py and run export by adding the flags:

--post-function fix_export_extension.py::fix_extension

it should rename the files that have an incorrect extension. It doesn't check for name collisions, and there may be some file types with more than one valid extension (for example, .jpg and .jpeg), in which case

""" Example function for use with osxphotos export --post-function option """

import pathlib
from typing import Callable

from osxphotos import ExportResults, PhotoInfo
from osxphotos.exiftool import ExifTool

def fix_extension(
    photo: PhotoInfo, results: ExportResults, verbose: Callable, **kwargs
):
    """Call this with osxphotos export /path/to/export --post-function fix_export_extension.py::fix_extension
        This will get called immediately after the photo has been exported

    See full example here: https://github.com/RhetTbull/osxphotos/blob/master/examples/post_function.py

    Args:
        photo: PhotoInfo instance for the photo that's just been exported
        results: ExportResults instance with information about the files associated with the exported photo
        verbose: A function to print verbose output if --verbose is set; if --verbose is not set, acts as a no-op (nothing gets printed)
        **kwargs: reserved for future use; recommend you include **kwargs so your function still works if additional arguments are added in future versions

    Notes:
        Use verbose(str) instead of print if you want your function to conditionally output text depending on --verbose flag
        Any string printed with verbose that contains "warning" or "error" (case-insensitive) will be printed with the appropriate warning or error color
        Will not be called if --dry-run flag is enabled
        Will be called immediately after export and before any --post-command commands are executed
    """

    for filepath in results.exported:
        filepath = pathlib.Path(filepath)
        ext = filepath.suffix.lower()
        if not ext:
            continue
        ext = ext[1:]  # remove leading dot
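        exiftool = ExifTool(filepath)
        actual_ext = exiftool.asdict().get("File:FileTypeExtension")
        if not actual_ext:
            # exiftool did not report a file type; leave the file alone
            continue
        actual_ext = actual_ext.lower()
        # treat .jpg and .jpeg as equivalent when the actual type is jpg
        if ext != actual_ext and (ext not in ("jpg", "jpeg") or actual_ext != "jpg"):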
            # WARNING: Does not check for name collisions; left as an exercise for the reader
            verbose(f"Fixing extension for {filepath} from {ext} to {actual_ext}")
            new_filepath = filepath.with_suffix(f".{actual_ext}")
            verbose(f"Renaming {filepath} to {new_filepath}")
            filepath.rename(new_filepath)
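
The name-collision check that the function above leaves out could be sketched roughly as follows. The helper names (unique_path, rename_with_new_suffix) and the " (1)", " (2)" numbering scheme are assumptions for illustration, not something osxphotos provides for post-functions.

```python
import pathlib


def unique_path(target: pathlib.Path) -> pathlib.Path:
    """Return target if it is free, else target with ' (1)', ' (2)', ... added to the stem."""
    candidate = target
    count = 1
    while candidate.exists():
        candidate = target.with_name(f"{target.stem} ({count}){target.suffix}")
        count += 1
    return candidate


def rename_with_new_suffix(filepath: pathlib.Path, new_ext: str) -> pathlib.Path:
    """Rename filepath to use new_ext (no leading dot) without overwriting an existing file."""
    new_filepath = unique_path(filepath.with_suffix(f".{new_ext}"))
    filepath.rename(new_filepath)
    return new_filepath
```

In fix_extension, the last three lines could then become a single call such as rename_with_new_suffix(filepath, actual_ext). There is still a small window between the existence check and the rename, so this is not safe if several processes write to the export directory at once.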
PetrochukM commented 1 year ago

I have 30k photos or so and had a couple of hundred issues. I had several issues with NEF, PNG, and JPEG files. I have been collecting my photos from across the internet, so I intend to do recurring exports as I collect more photos! I have been manually fixing these issues: downloading the images, renaming them, and uploading them. These issues have been coming up regularly, especially with Facebook exports and some iPhone screenshots.

Unfortunately, Apple Photos sometimes bugs or crashes when dealing with these types of files. That has made it more challenging to resolve this issue. Today, I needed to find the original files because I could not export the original files from Apple Photos without crashing.

Thanks for providing the script, I'll try it out!

If I use it with --cleanup, it'll just re-do this every time. I am okay with that. It's only a couple hundred photos, and this should be a pretty quick operation to rename the files, yeah?

oPromessa commented 1 year ago

Hi @PetrochukM

What I do is try to fix as many problems as possible (extensions, EXIF date/times, QuickTime date/times, video conversions/rotations) before uploading the files to Photos, so that osxphotos can then apply its magic.

PLEASE NOTE MOST OF THESE COMMANDS USE THE -overwrite_original_in_place WHICH WRITES OVER THE FILE ITSELF. SO DO TEST IT OUT AND SAVE COPIES OF THE ORIGINALS, just in case!!!

###############################################################################
#
# EXIFTOOL related commands
#
###############################################################################

#==============================================================================
# exif() Displays some key EXIF tags from files/directories recursively.
#------------------------------------------------------------------------------
function exif() {
    # To overcome incompatibility of xattr (loaded by osxphotos).
    # Force PATH to force xattr to be sourced from /usr/bin
    # *** Adapt/change the location of the txt file
    PATH=/usr/bin:$PATH exiftool -r -d """%Y:%m:%d %H:%M:%S""" -p """/Users/YourUser/format.txt""" -f "$@"
}
export -f exif
#------------------------------------------------------------------------------

#==============================================================================
# exiforiginaldate() Copies the original date from EXIF to all other dates in the file.  To align all.
#------------------------------------------------------------------------------
function exiforiginaldate () {
    echo 'Performing... -CreateDate<DateTimeOriginal -FileModifyDate<DateTimeOriginal -ModifyDate<DateTimeOriginal'
    # 2017.04.13 Included changing the ModifyDate tag (operating system level, which could also be set via touch command)
    exiftool -d """%Y:%m:%d %H:%M:%S""" -overwrite_original_in_place -fileOrder DateTimeOriginal """-CreateDate<DateTimeOriginal""" """-FileModifyDate<DateTimeOriginal""" """-ModifyDate<DateTimeOriginal""" -v "$@"
}
export -f exiforiginaldate
#------------------------------------------------------------------------------

#==============================================================================
# exifKeysdate() Copies Keys:CreationDate into DateTimeOriginal. Some videos (QT) have this as the correct date and don't have DateTimeOriginal and/or the FileModifyDate is wrong.
#------------------------------------------------------------------------------
function exifKeysdate () {
    echo Performing... ""-DateTimeOriginal\<Keys:CreationDate"" 
    exiftool -overwrite_original_in_place """-DateTimeOriginal<Keys:CreationDate""" -P -v "$@"
}
export -f exifKeysdate
#------------------------------------------------------------------------------

#==============================================================================
# exifpng2jpg() Converts .PNG files to .JPG and copies the EXIF fields. Works on files in the current directory.
#------------------------------------------------------------------------------
function exifpng2jpg() {
# Command which convert .PNG files into .JPG and copies EXIF fields ...

    for a in "$@"
    do
        fname=`basename "$a" .png`
        xtension=png
        echo 1st -  ${fname}.${xtension}

        if [ \( -f "${fname}".png -o -f "${fname}".PNG \) -a ! -f "${fname}".jpg  ]
        then
            echo ok -  ${fname}.${xtension}
            sips -s format jpeg -s formatOptions 90 """${fname}".${xtension}"" --out """${fname}".jpg""
            exiftool -overwrite_original_in_place -TagsFromFile "${fname}".${xtension} "-all:all>all:all" "${fname}".jpg
        else
            fname=`basename "$a" .PNG`
            xtension=PNG
            echo 2nd -  ${fname}.${xtension}

            if [ \( -f "${fname}".png -o -f "${fname}".PNG \) -a ! -f "${fname}".jpg  ]
            then
                echo ok -  ${fname}.${xtension}
                sips -s format jpeg -s formatOptions 90 """${fname}".${xtension}"" --out """${fname}".jpg""
                exiftool -overwrite_original_in_place -TagsFromFile "${fname}".${xtension} "-all:all>all:all" "${fname}".jpg
            fi
        fi  
    done
}  
export -f exifpng2jpg
#------------------------------------------------------------------------------

#==============================================================================
# exiftif2jpg() # Command which convert .TIF files into .JPG and copies EXIF fields ... Uses ModifyDate as the source for DateTimeOriginal, FileModifyDate, CreateDate fields
#------------------------------------------------------------------------------
function exiftif2jpg() {
# Command which convert .TIF files into .JPG and copies EXIF fields ... Uses ModifyDate as the source for DateTimeOriginal, FileModifyDate, CreateDate fields

    for a in "$@"
    do
        fname=`basename "$a" .tif`
        dname=`dirname "$a" `
        if [ -f "${dname}/${fname}".tif -a ! -f "${dname}/${fname}".jpg  ]
        then
            echo ok -  ${fname}
            sips -s format jpeg -s formatOptions 100 """${dname}/${fname}".tif"" --out """${dname}/${fname}".jpg""
            exiftool -overwrite_original_in_place -TagsFromFile "${dname}/${fname}".tif "-all:all>all:all" "${dname}/${fname}".jpg
            # The dates below are copied from ModifyDate, so the message reports ModifyDate as the source.
            echo "Performing... -DateTimeOriginal<ModifyDate -CreateDate<ModifyDate -FileModifyDate<ModifyDate on ${dname}/${fname}.jpg"
            exiftool -overwrite_original_in_place """-DateTimeOriginal<ModifyDate""" """-CreateDate<ModifyDate""" """-FileModifyDate<ModifyDate""" -P -v "${dname}/${fname}".jpg
        fi  
    done
}  
export -f exiftif2jpg
#------------------------------------------------------------------------------

#==============================================================================
# exifsetdate() Takes a parameter in the format %Y:%m:%d %H:%M:%S and sets the Original date of the file(s).
#------------------------------------------------------------------------------
function exifsetdate() {

    datetoset=${1}; echo $datetoset
    shift
    exiftool -d """%Y:%m:%d %H:%M:%S""" -overwrite_original_in_place -fileOrder DateTimeOriginal -DateTimeOriginal="""${datetoset}"""  -v "$@"
}
export -f exifsetdate
#------------------------------------------------------------------------------

#==============================================================================
# exifsetalldates() Takes a parameter in the format %Y:%m:%d %H:%M:%S and sets ALL dates of the file(s).
#------------------------------------------------------------------------------
function exifsetalldates() {

    datetoset=${1}; echo $datetoset
    shift
    exiftool -d """%Y:%m:%d %H:%M:%S""" -overwrite_original_in_place -fileOrder DateTimeOriginal -DateTimeOriginal="""${datetoset}""" -CreateDate="""${datetoset}""" -FileModifyDate="""${datetoset}""" -ModifyDate="""${datetoset}""" -v "$@"
}
export -f exifsetalldates
#------------------------------------------------------------------------------

#==============================================================================
# exifrenamefile() Renames the file based on its Original date. File name format will be: IMG_%Y%m%d_%H%M%S%%-c.%%e, keeping the same extension.
#------------------------------------------------------------------------------
function exifrenamefile() {
    #exiftool -v4 -r -d """%Y-%m-%d %H.%M.%S%%-c.%%le""" '-filename<DateTimeOriginal' -f "$@"
    # %%le lowers the case of extension. Hopefully %%e keeps the same extension
    exiftool -v4 -r -d """IMG_%Y%m%d_%H%M%S%%-c.%%e""" '-filename<DateTimeOriginal' -f "$@"
    echo "Note: Do you need to run exiforiginaldate?"
}
export -f exifrenamefile

#------------------------------------------------------------------------------

#==============================================================================
# exifcopytags()
#   - srcfile
#   - dstfile
# Copy all metadata from one file to another
# exiftool -TagsFromFile srcimage.jpg "-all:all>all:all" targetimage.jpg
# USE -overwrite_original_in_place if applicable
# USING IT WITHOUT all:all will copy everything including GPS data (works with MP4)
#------------------------------------------------------------------------------
function exifcopytags() {
E_BADARGS=85
E_BADFILES=90

    if [ ! -n "$1" -o ! -n "$2" ]
    then
        echo "Usage: exifcopytags srcfile dstfile"
        return $E_BADARGS
    elif [ -f "$1" ]
    then
        srcfile=${1}; shift

        echo Copying all tags from "${srcfile}" to "$@"
        exiftool -v -overwrite_original_in_place -TagsFromFile "${srcfile}" "${@}"
    else
        echo "Error: file(s) not found!"
        echo "Usage: exifcopytags srcfile dstfile"
        return $E_BADFILES
    fi  
}
export -f exifcopytags
#------------------------------------------------------------------------------

#==============================================================================
# exifcleantags()
#   - files
# Clean all metadata from one or more files
# exiftool -overwrite_original -all= -gps:all= *.jpg
# USE -overwrite_original_in_place if applicable
#------------------------------------------------------------------------------
function exifcleantags() {
E_BADARGS=85
E_BADFILES=90

    if [ ! -n "$1" ]
    then
        echo "Usage: exifcleantags files"
        return $E_BADARGS
    elif [ -f "$1" ]
    then
        echo Cleaning all tags from "$@"
        exiftool -v -overwrite_original -all= -gps:all= "${@}"
    else
        echo "Error: file(s) not found!"
        echo "Usage: exifcleantags files"
        return $E_BADFILES
    fi  
}
export -f exifcleantags
#------------------------------------------------------------------------------
PetrochukM commented 1 year ago

Thanks for sending that. I am not sure I am ready to copy all my photos and run that over them. I think it's small enough that I don't want to take that risk (:

That said, I did learn from this that EXIF might be hiding metadata that could be useful to me! I have many mystery files that I am struggling to pin down the creation date for.

RhetTbull commented 1 year ago

@PetrochukM I've added a script find_bad_extensions.py to the examples directory. This will scan your Photos library to find all photos that have a bad extension. It caches the results so when you re-run it, it doesn't have to scan each photo again.

If you save the file at the link above to find_bad_extensions.py you can run with osxphotos via: osxphotos run find_bad_extensions.py. You can see the help (pasted below) using osxphotos run find_bad_extensions.py --help. Info on files with bad extensions is output as CSV format to STDOUT so you can do this to capture the results: osxphotos run find_bad_extensions.py > results.csv.

Usage: osxphotos run find_bad_extensions.py [OPTIONS]

  Scan Photos library to find photos with bad (incorrect) file extensions.

  This can be run with osxphotos via: `osxphotos run find_bad_extensions.py`

  Both STDOUT and STDERR are used to output results.

  STDOUT is used to output a CSV file with the following columns:

  uuid, original_filename, version, current_extension, correct_extension, path

  Thus, to save the results to a file, run:

  osxphotos run find_bad_extensions.py > results.csv

Options:
  --library PATH  Path to Photos library to use. Default is to use default
                  Photos library.
  --recheck       Recheck all files even if previously checked and cached.
  --edited        Check edited versions of photos in addition to originals.
  --help          Show this message and exit.

You can use this to find all the bad extensions then you can export those, fix the extension and re-import them if desired. I intend to eventually add this as an osxphotos command to automatically fix the extensions by automating the export, fix, re-import (and re-apply metadata). Ref. #336
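
For a quick overview of results.csv before exporting anything, something like the following sketch could be used. It relies only on the column names listed in the help text above; whether the file starts with a header row is an assumption, so the code tolerates both cases. The function names summarize and print_summary are hypothetical.

```python
import csv
from collections import Counter
from typing import Iterable

# Column names as listed in the script's help text
COLUMNS = [
    "uuid",
    "original_filename",
    "version",
    "current_extension",
    "correct_extension",
    "path",
]


def summarize(lines: Iterable[str]) -> Counter:
    """Count rows per (current_extension, correct_extension) pair."""
    counts: Counter = Counter()
    for row in csv.DictReader(lines, fieldnames=COLUMNS):
        if row["uuid"] == "uuid":
            continue  # skip a header row, if the file has one (assumption)
        counts[(row["current_extension"], row["correct_extension"])] += 1
    return counts


def print_summary(csv_path: str = "results.csv") -> None:
    """Print one line per extension pair, most frequent first."""
    with open(csv_path, newline="") as f:
        for (current, correct), n in summarize(f).most_common():
            print(f"{current} -> {correct}: {n}")
```

Calling print_summary() after running osxphotos run find_bad_extensions.py > results.csv would show which wrong-to-right extension pairs are most common in the library.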

RhetTbull commented 1 year ago

"I have many mystery files that I am struggling to pin down the creation date for."

You might want to check out osxphotos help timewarp to learn about the timewarp tool which can bulk-adjust the creation date for photos. It can also pull the creation date from EXIF (--pull-exif) and looks at a much wider range of EXIF fields than Photos does.

PetrochukM commented 1 year ago

Thank you so much for this! This should be really helpful!