RhetTbull / osxphotos

Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.
MIT License

--name doesn't work correctly with all unicode characters #594

Closed RhetTbull closed 2 years ago

RhetTbull commented 2 years ago

If a photo's name in Photos contains Unicode characters and you pass --name NAME_WITH_UNICODE on the command line, osxphotos doesn't always find the photos. The search needs to apply the same Unicode normalization that is done when writing files, but in reverse.

To reproduce:

osxphotos export ~/Desktop/export --db tests/Test-10.15.7.photoslibrary --preview --verbose --update --name Frítest

No files are found even though multiple photos match the name Frítest.jpg.
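A minimal sketch of why the lookup can fail, assuming the name typed on the command line and the name stored with the photo use different Unicode normalization forms (which form each side actually uses is an assumption here). It uses only the standard-library `unicodedata` module; `name_matches` is a hypothetical helper, not an osxphotos function.

```python
import unicodedata

# The same visible name in two Unicode forms.
composed = unicodedata.normalize("NFC", "Frítest")    # í as one code point (U+00ED)
decomposed = unicodedata.normalize("NFD", "Frítest")  # i followed by U+0301


def name_matches(query: str, photo_name: str) -> bool:
    """Substring match after normalizing both sides to NFC (hypothetical helper)."""
    query_nfc = unicodedata.normalize("NFC", query)
    name_nfc = unicodedata.normalize("NFC", photo_name)
    return query_nfc in name_nfc


print(composed == decomposed)                        # False
print(len(composed), len(decomposed))                # 7 8
print(composed in decomposed + ".jpg")               # False: raw comparison misses
print(name_matches(composed, decomposed + ".jpg"))   # True: normalized comparison hits
```

Normalizing both the query and the stored name to one form before comparing makes the search independent of how either string was produced.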

oPromessa commented 2 years ago

Hello there.

This is possibly related to this issue, or reminiscent of bug #561: I have a file name with the special characters "!" and ""/"?" that gets deleted from the export folder when using the update and cleanup options.

Writing metadata with exiftool for /Volumes/photo-1/ele é!.JPG
Updating file modification time for /Volumes/photo-1/ele é!.JPG
Exported new file /Volumes/photo-1/ele é!.JPG
(...)
Touched date on file /Volumes/photo-1/ele é!.JPG

(...)
Deleting /Volumes/photo-1/ele é!?.JPG

RhetTbull commented 2 years ago

I think that's definitely related to #561, so I've reopened this. osxphotos is comparing files on disk to files in the export database and thinks the names are different. Handling special characters has been difficult! Many of these characters have at least two Unicode representations, and the one Photos uses is different from the one the filesystem uses. I try to normalize all filesystem paths using Apple's NSString.fileSystemRepresentation (though I see in the profile results I just did for #582 that this is one of the slowest parts of the code). osxphotos shouldn't be deleting any photos that were part of the export set, so I'll need to figure out how to fix this.
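The comparison problem described above can be sketched as follows. This is not the osxphotos implementation: it swaps in the standard-library `unicodedata.normalize` for `NSString.fileSystemRepresentation`, and the names `normalize_path` and `files_to_delete` are hypothetical. It assumes the export database holds the composed (NFC) form while the directory listing returns the decomposed (NFD) form. The cache is one possible way to reduce the cost of normalizing the same paths repeatedly.

```python
import unicodedata
from functools import lru_cache
from typing import Iterable, List


@lru_cache(maxsize=None)
def normalize_path(path: str) -> str:
    """Return the path in NFC form; cached because the same paths recur."""
    return unicodedata.normalize("NFC", path)


def files_to_delete(on_disk: Iterable[str], exported: Iterable[str]) -> List[str]:
    """Return on-disk paths that match no exported path after normalization."""
    exported_norm = {normalize_path(p) for p in exported}
    return [p for p in on_disk if normalize_path(p) not in exported_norm]


# Assumed scenario: database stores NFC, the directory listing returns NFD.
exported = [unicodedata.normalize("NFC", "/Volumes/photo-1/ele é!.JPG")]
on_disk = [
    unicodedata.normalize("NFD", "/Volumes/photo-1/ele é!.JPG"),
    "/Volumes/photo-1/stray.JPG",  # hypothetical file that was never exported
]

print(on_disk[0] in exported)              # False: raw strings differ
print(files_to_delete(on_disk, exported))  # ['/Volumes/photo-1/stray.JPG']
```

Without the normalization step, the exported file itself would look like a stray file and be a candidate for deletion.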

RhetTbull commented 2 years ago

@oPromessa I'm not able to replicate this using your filename. (That likely has to do with the way the characters were transcribed from GitHub markdown to my computer.) Could you create a sample image with this name and upload it as a zip?

I tried to copy the name in your sample image to rename a test image both in Finder and in the Terminal but neither gets deleted on export.

[Screenshot: Screen Shot 2022-01-23 at 8 05 02 AM]

RhetTbull commented 2 years ago

Some interesting reading on the filesystem / unicode normalization issue here and here.

oPromessa commented 2 years ago

I'll have to dig into this a bit more. From a quick read of your links, I believe it's related to how filenames are processed on the local Mac filesystem versus on the NAS (Synology) filesystem mounted remotely via SMB. I'll do a few more tests.

RhetTbull commented 2 years ago

--name also searches both the current and the original filename. On Photos 5+ this is wrong because the current filename is always the UUID, which could contain the string being searched for. For example:

--name AA found:

Exporting Tulips.jpg (6191423D-8DB8-4D4C-92BE-9BBBA308AAC4.jpeg) as Tulips.jpg (2/5)
Exporting AAF035.jpg (27682111-4F90-4856-A421-B19AA173506A.jpeg) as AAF035.jpg (3/5)
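The false positive above can be reproduced with a small sketch. The `Photo` dataclass and both matching functions are hypothetical stand-ins for illustration, not the osxphotos code; the filenames are the two from the output above. Restricting the match to the original filename is one possible fix, shown here only as an assumption about how it might be done.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Photo:
    original_filename: str  # name as imported, e.g. "Tulips.jpg"
    filename: str           # current name in the library; UUID-based on Photos 5+


photos = [
    Photo("Tulips.jpg", "6191423D-8DB8-4D4C-92BE-9BBBA308AAC4.jpeg"),
    Photo("AAF035.jpg", "27682111-4F90-4856-A421-B19AA173506A.jpeg"),
]


def match_both(photos: List[Photo], name: str) -> List[str]:
    """Match against current and original filename (the behavior described above)."""
    return [
        p.original_filename
        for p in photos
        if name in p.original_filename or name in p.filename
    ]


def match_original_only(photos: List[Photo], name: str) -> List[str]:
    """Match against the original filename only, ignoring the UUID-based name."""
    return [p.original_filename for p in photos if name in p.original_filename]


print(match_both(photos, "AA"))           # ['Tulips.jpg', 'AAF035.jpg']
print(match_original_only(photos, "AA"))  # ['AAF035.jpg']
```

Tulips.jpg matches in the first case only because its UUID happens to contain "AA".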