Closed ifarfan closed 2 years ago
I'll take a look at this. This is weird as the code goes to great lengths to preserve special characters and even has a test for this very case (where the album name has diacritics in it). The code that creates export directories sanitizes for illegal path characters and replaces them automatically. The "_" folder is used as a default when the template doesn't match (in this case "{folder_album}") but in your example, if the photos are in the album, they should not end up in the "_" folder.
Reference these test {folder_album} cases:
Thanks!
Also, I didn't mention that said album is also within a few other folders:
Thus the final path would've been: /Volumes/photos/Travel/Europe/Kraków 2016
but instead all the photos ended up in /Volumes/photos/_
Just in case the filename escaping logic trips under a folder hierarchy
I just did a test using your exact scenario and export command and was not able to replicate this.
I created an album called Kraków 2016 in the folder tree Travel/Europe.
Then ran the following export command:
osxphotos export ~/Desktop/export --directory "{folder_album}" \
--exiftool \
--exiftool-option '-m' \
--person-keyword \
--album-keyword \
--skip-original-if-edited \
--update \
--overwrite \
--current-name \
--touch-file \
--retry 2
The resulting Kraków 2016 folder was created as expected.
Please try the following command and post the output here or send it to me at rturnbull+git@gmail.com so I can do some further debugging on your particular scenario.
osxphotos export ~/Desktop/export --album "Kraków 2016" --verbose --directory "{folder_album}" > debug.txt
Replace the ~/Desktop/export path with a temporary export path of your choosing.
Maybe this helps: Unfortunately there are many ways to express accented characters. And as I had to learn, OSX and Synology seem not always 100% compatible, even for simple classics like äöüéè etc. and when UTF-8 (the default) is set on both sides.
I found a Python script "nfcfn.py" here on GitHub to clean up ("normalize") these characters, which I've used to clean my UTF encoding to one single working model on my Synology NAS.
Commands like ls -d * | od -tax1 (display folder contents as raw hex bytes) had initially helped me debug and identify these issues between DSM and OSX (and iOS).
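For readers who want to see what such a clean-up involves, here is a minimal sketch of the idea. It is not the nfcfn.py script mentioned above (I have not reproduced its code); the function names nfc_name and plan_nfc_renames are made up for illustration. It only lists what would be renamed and changes nothing on disk.

```python
import os
import unicodedata


def nfc_name(name: str) -> str:
    """Return the NFC-normalized form of a single file or folder name."""
    return unicodedata.normalize("NFC", name)


def plan_nfc_renames(root: str) -> list[tuple[str, str]]:
    """List (old_path, new_path) pairs for entries whose names are not NFC.

    Hypothetical helper: walks bottom-up so that children are listed before
    their parent directories, which keeps the old paths valid if the renames
    are later applied in this order.
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            target = nfc_name(name)
            if target != name:
                plan.append(
                    (os.path.join(dirpath, name), os.path.join(dirpath, target))
                )
    return plan


# Dry-run usage (prints the planned renames, renames nothing):
#   for old, new in plan_nfc_renames("/path/to/export"):
#       print(f"{old!r} -> {new!r}")
```

Some file systems normalize or compare names on their own, so the result may differ from one volume to the next; checking the plan before applying any rename seems prudent.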
Background is that Unicode offers multiple ways of expressing the same character, which encode to different UTF-8 byte sequences, and macOS and Synology seem to differ in the details. E.g. the German character "ü" can be encoded OSX-style as a 'u' followed by the two bytes "\xCC" and "\x88". These two bytes together make up the UTF-8 representation of \u0308, the "combining diaeresis" [i.e. two dots above the preceding character, which is called „Kombinierendes Trema“ in German].
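The two representations can be reproduced with Python's standard unicodedata module. This is only an illustration of the byte sequences; the variable names are arbitrary.

```python
import unicodedata

composed = "\u00fc"      # precomposed 'ü' (what NFC produces)
decomposed = "u\u0308"   # 'u' + combining diaeresis (what NFD produces)

# Same visible character, different UTF-8 bytes.
composed_hex = composed.encode("utf-8").hex(" ")
decomposed_hex = decomposed.encode("utf-8").hex(" ")
print(composed_hex)      # c3 bc
print(decomposed_hex)    # 75 cc 88

# A plain string comparison treats them as different...
print(composed == decomposed)  # False

# ...until both sides are normalized to the same form.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```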
So as a result, we have at least two ways to express the umlaut 'ü' in UTF-8:
0x75 0xCC 0x88 ('u' followed by the combining diaeresis)
0xC3 0xBC (the single precomposed character)
Both display as ü, so it is not visible which one is being used.
(BTW the most extreme case has been PhotoSync, my favorite app to export photos from iOS, where "favorite" flags can optionally be exported as a single character ❤️ in the filename. If exported via FTP to my Synology (all latest versions, all with default settings), the ❤️ arrives in the filenames as hex "c3 a2 c2 9d c2 a4 c3 af c2 b8 c2 8f": 12 bytes for a single emoji, which DSM then has not been able to interpret correctly.)
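That 12-byte sequence is what you get when the emoji's UTF-8 bytes are misread as Latin-1 and then encoded as UTF-8 a second time. Whether that is what actually happened on the FTP path is an assumption on my part; the sketch below only shows that the bytes match and that such a double encoding can be undone.

```python
heart = "\u2764\ufe0f"            # heavy black heart + variation selector

utf8 = heart.encode("utf-8")      # correct encoding: 6 bytes
# Assumed failure mode: bytes decoded as Latin-1, then re-encoded as UTF-8.
double = utf8.decode("latin-1").encode("utf-8")

utf8_hex = utf8.hex(" ")
double_hex = double.hex(" ")
print(utf8_hex)     # e2 9d a4 ef b8 8f
print(double_hex)   # c3 a2 c2 9d c2 a4 c3 af c2 b8 c2 8f

# Reversing the two steps recovers the original character.
repaired = double.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == heart)  # True
```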
@ifarfan I've not been able to replicate this bug. If you try the following command and post the output here or send it to me at rturnbull+git@gmail.com so I can do some further debugging on your particular scenario that would be helpful.
osxphotos export ~/Desktop/export --album "Kraków 2016" --verbose --directory "{folder_album}" > debug.txt
Replace the ~/Desktop/export path with a temporary export path of your choosing.
@ifarfan if you're still using osxphotos are you still having problems with this issue? I've made some changes to Unicode handling in osxphotos recently that might help. Try running with the latest version and let me know if you still have issues.
@RhetTbull thanks for the follow up! I manually renamed all the folders with foreign accents and haven't had issues again, it might've been an invalid hidden character and/or a linux-to-mac unicode hiccup during a folder rename/copy
I'm good now 👍
@RhetTbull, I ran into this issue today when exporting via osxphotos and comparing the result with the original folder, which I imported to Apple Photos years ago.
Original Import Folder
JotMac:Scripts jotzet$ echo -n "1999-02 - Bundesheer Lilienfeld D-Brückenbau" | od -A n -t x1
31 39 39 39 2d 30 32 20 2d 20 42 75 6e 64 65 73
68 65 65 72 20 4c 69 6c 69 65 6e 66 65 6c 64 20
44 2d 42 72 75 cc 88 63 6b 65 6e 62 61 75
OSXPhotos Export Folder
JotMac:Scripts jotzet$ echo -n "1999-02 - Bundesheer Lilienfeld D-Brückenbau" | od -A n -t x1
31 39 39 39 2d 30 32 20 2d 20 42 75 6e 64 65 73
68 65 65 72 20 4c 69 6c 69 65 6e 66 65 6c 64 20
44 2d 42 72 c3 bc 63 6b 65 6e 62 61 75
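The two dumps differ only in the "ü": 75 cc 88 in the original folder versus c3 bc in the export. A small Python check can label which normalization form a name is in. This is a sketch, assuming Python 3.8+ for unicodedata.is_normalized; describe is a made-up helper name, and the strings are shortened to the part of the folder name that matters.

```python
import unicodedata


def describe(name: str) -> str:
    """Return which canonical normalization form(s) a string already satisfies."""
    forms = [f for f in ("NFC", "NFD") if unicodedata.is_normalized(f, name)]
    return "/".join(forms) if forms else "mixed"


original = "D-Bru\u0308ckenbau"   # original import folder: 75 cc 88
exported = "D-Br\u00fcckenbau"    # osxphotos export folder: c3 bc

print(describe(original))      # NFD
print(describe(exported))      # NFC
print(describe("Lilienfeld"))  # NFC/NFD (plain ASCII satisfies both)

# The names are equal once both are brought to the same form.
print(unicodedata.normalize("NFC", original) == exported)  # True
```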
I also set the Locale Shell variable export LC_ALL="de_AT.UTF-8" and tried to do the report, but I had no luck.
Will look now into nfcfn.py as proposed by @ubrandes.
Thank you, Joachim
BTW, it's always those guys dealing with the German (or Polish in this case) umlauts ;-) #208
Album names are Unicode 'normalized' on osxphotos import by RhetTbull in one of the latest versions, #1475; see also #1085 (with a very complete description of the wonders of "Unicode characters can take one of 4 different normalization forms: NFC, NFD, NFKC, NFKD").
So I'd guess now:
But somehow not aligned with the original folder name in the file system from the moment it was imported.
@jotzet79 unicode is always tricky to deal with and it's entirely possible there's a bug in OSXPhotos. Here's what OSXPhotos does at the moment:
When comparing text, rendering templates, writing data to Photos (e.g. creating albums), etc., OSXPhotos always converts to NFC formatted unicode. This is consistent with what macOS does.
However, when creating filenames and directories, OSXPhotos will convert to NFD format if on macOS, otherwise NFC if on linux. This is consistent with the default behavior of the two operating systems.
That means the album name in Photos may be different than the folder name on disk though visually they will be the same. Internally they would use 2 different unicode normalization forms.
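To make the behavior described above concrete, here is a rough sketch. It is not the actual OSXPhotos code, and the function names are invented for illustration: NFC for text handled internally, NFD for paths on macOS, NFC for paths elsewhere.

```python
import sys
import unicodedata


def normalize_internal(text: str) -> str:
    """Form used for comparing text and rendering templates (NFC, per the description)."""
    return unicodedata.normalize("NFC", text)


def normalize_for_filesystem(text: str, platform: str = sys.platform) -> str:
    """Form used for file and directory names: NFD on macOS, NFC otherwise.

    Hypothetical helper; `platform` takes the same values as sys.platform.
    """
    form = "NFD" if platform == "darwin" else "NFC"
    return unicodedata.normalize(form, text)


album = normalize_internal("Krako\u0301w 2016")   # 'o' + combining acute accent
on_mac = normalize_for_filesystem(album, platform="darwin")
on_linux = normalize_for_filesystem(album, platform="linux")

print(album == "Krak\u00f3w 2016")   # True: internal form is precomposed
print(on_mac == album)               # False: the macOS path is decomposed
print(on_linux == album)             # True: the linux path matches the internal form
```

This is why an album name and its exported folder can look identical and still fail a byte-for-byte comparison on macOS.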
I've considered in the past adding a unicode template that would convert text to a given encoding. For example: {unicode.nfc:{folder_album}} or {folder_album|unicode.nfc}.
Internally this would take a fair bit of work because the template system normalizes everything. Another option is to specify the "internal" unicode format and the "external (on disk)" unicode format via options. This is much easier to implement as osxphotos already contains methods to globally adjust this in the code. For example:
osxphotos export --directory {folder_album} --unicode-filesystem NFC --unicode-internal NFD
I'll open a new issue for this.
@oPromessa to answer your questions inline...
- Are you in the latest osxphotos?
Yes, absolutely
JotMac:Pictures jotzet$ osxphotos --version
osxphotos, version 0.68.1
Python 3.12.3 (main, Apr 9 2024, 16:54:45) [Clang 14.0.0 (clang-1400.0.29.202)]
macOS 13.6.7, x86_64
How did you do the import and name the album?
- via direct import on Photos
This: Long time ago I did photos management via Files'n'Folders. During Corona all my legacy pics (=non Smartphone) were geotagged, and then I imported all my folders (=albums) via DnD to Apple Photos.
- via osxphotos import?
Nope, as this didn't exist back then.
- did you name the album in Photos itself ?
Nope, the naming stems from the initial folders on the filesystem
Album names are Unicode 'normalized' on osxphotos import by RhetTbull in one of the latest versions, #1475; see also #1085 (with a very complete description of the wonders of "Unicode characters can take one of 4 different normalization forms: NFC, NFD, NFKC, NFKD").
Yep, in the meantime I have also written a folder "translation" script in Python using the unicodedata module.
So I'd guess now:
- exported album names are correct.
Copy/Paste album name from Apple Photos to rename a folder: b'/Users/jotzet/Pictures/Export/2000/1999-02 - Bundesheer Lilienfeld D-Bru\xcc\x88ckenbau'
Creating album manually with 'ü' char results in an osxphotos export of: b'/Users/jotzet/Pictures/Export/1999/\xc3\xbc'
Creating folder manually with 'ü' char b'/Users/jotzet/Pictures/Export/1999/u\xcc\x88'
- If you osxphotos import this folder and export it: it should also be correct.
Yet to be verified...
But somehow not aligned with the original folder name in the file system from the moment it was imported.
@RhetTbull: Thank you for your prompt response!
Oddly, when I tested inputting data to create albums and folders anew, it always resulted in an NFD-based unicode representation (see above). But honestly I might be completely wrong: these character representations and encodings are really driving me nuts... 😃
But please also be aware that I really don't consider this issue high priority. It's probably an edge case, and others don't have this problem anyway.
Again, I really enjoy using osxphotos (especially the inspect feature is really sexy) - Thank you!
Kind regards, Joachim
PS: If you are really "freaky enough" , you might try out macOS's Keyboard Viewer in combination with German (or even Austrian) Keyboard Settings to be able to reproduce this mess. Maybe other locale settings behave differently, who knows?
Hey guys,
While doing an export of all my albums I noticed that an album named Kraków 2016 was being exported as _
So instead of creating /Volumes/photos/Kraków 2016 it was creating /Volumes/photos/_
This is the command I issued:
Once I removed the accented character, osxphotos was able to correctly create the album folder.
In addition, if I issue osxphotos albums, the album Kraków 2016 shows up correctly.
The only issue I can think of is that I'm using a Synology NAS for my externally mounted volume, but I haven't had any issues creating/updating files/folders from my Mac using accented characters (and I use them all over the place).
Amazing tool, btw!