RhetTbull / osxmetadata

Python package to read and write various MacOS extended attribute metadata such as tags/keywords and Finder comments from files. Includes CLI tool for reading/writing metadata.
MIT License

What is the correct way to format keywords aka kMDItemKeywords? #83

Open luckman212 opened 1 year ago

luckman212 commented 1 year ago

For some apps on my system, I get incorrect output from osxmetadata --get kMDItemKeywords <path>

For example, for CleanShot X each character is displayed as a separate array element:

image

Are these supposed to be strings, arrays, or ... ? What's the correct way to write multiple tags to a file?

osxmetadata --set keywords foo --set keywords bar --set keywords 'baz quux' myApp.app

or

osxmetadata --set keywords "foo,bar,baz quux" myApp.app

or

osxmetadata --set keywords "( foo bar 'baz quux' )" myApp.app

?

RhetTbull commented 1 year ago

Did you set the keywords on CleanShot X with osxmetadata, or were they set by the app itself or some other app? kMDItemKeywords should be a list of strings.

The intended format for setting keywords is:

osxmetadata --set keywords foo --append keywords bar --append keywords 'baz quux' myApp.app

I'm not at my Mac but will verify this later tonight.

For multi-value attributes, you need to set the value first (overwrite what's there) then append additional values.

luckman212 commented 1 year ago

Thanks for looking. No, I did not set that metadata myself; it's what comes out of the box with that particular app. I note that, at least for Spotlight searches, it is parsed correctly by mds, even though the format might not be correct or recognized by osxmetadata.

Thanks for the syntax for multiple keywords. I didn't know to use --set keywords foo --append keywords bar. I was using repeated --set options, which somehow also seems to work.

RhetTbull commented 1 year ago

This is very odd.

First, I verified that osxmetadata properly sets multiple keywords (kMDItemKeywords) with either repeated --set or --set followed by --append. I will improve the documentation to make this clearer (#84).

$ osxmetadata --set keywords foo --set keywords bar t.txt
$ osxmetadata --get keywords t.txt
keywords                  kMDItemKeywords                   = foo, bar
$ osxmetadata --set keywords foo --append keywords bar t.txt
$ osxmetadata --get keywords t.txt
keywords                  kMDItemKeywords                   = foo, bar

I've also verified that, for some apps I have installed, osxmetadata displays the keywords improperly in the same manner as your example (a list of characters instead of a list of keywords). For example:

$ osxmetadata --get kMDItemDisplayName --get keywords /Applications/*.app
displayname               kMDItemDisplayName                = Yoink
keywords                  kMDItemKeywords                   = d, r, a, g, ,, d, r, o, p, ,, m, o, v, e, ,, c, o, p, y, ,, a, l, i, a, s, ,, f, i, l, e, ,, s, h, e, l, f, ,, u, t, i, l, i, t, y, ,, s, h, e, l, v, e, s, ,, d, o, c, u, m, e, n, t, s
displayname               kMDItemDisplayName                = iMazing
keywords                  kMDItemKeywords                   = i, O, S, ,,  , i, P, h, o, n, e, ,,  , i, P, a, d, ,,  , i, P, o, d, ,,  , B, a, c, k, u, p, ,,  , B, a, c, k,  , u, p, ,,  , R, e, s, t, o, r, e, ,,  , T, r, a, n, s, f, e, r, ,,  , E, x, t, r, a, c, t, ,,  , E, x, p, o, r, t, ,,  , C, o, p, y, ,,  , M, a, n, a, g, e, ,,  , A, p, p, s, ,,  , M, e, s, s, a, g, e, ,,  , M, e, s, s, a, g, e, s, ,,  , i, M, e, s, s, a, g, e, ,,  , i, M, e, s, s, a, g, e, s, ,,  , S, M, S, ,,  , M, M, S, ,,  , C, a, l, l,  , l, o, g, ,,  , N, o, t, e, ,,  , N, o, t, e, s, ,,  , C, o, n, t, a, c, t, ,,  , C, o, n, t, a, c, t, s, ,,  , V, o, i, c, e, ,,  , M, e, m, o, ,,  , M, e, m, o, s, ,,  , V, o, i, c, e, m, a, i, l, ,,  , P, h, o, t, o, ,,  , P, h, o, t, o, s, ,,  , P, i, c, t, u, r, e, ,,  , P, i, c, t, u, r, e, s,  , F, i, l, e, ,,  , F, i, l, e, s, ,,  , M, u, s, i, c, ,,  , S, o, n, g, s, ,,  , V, i, d, e, o, ,,  , V, i, d, e, o, s, ,,  , i, B, o, o, k, s, ,,  , D, a, t, a

Every app that had keywords did this. However, I have also verified that the metadata stored for these apps in kMDItemKeywords does not conform to the Apple documentation for kMDItemKeywords, which states:

kMDItemKeywords
Keywords associated with this file. For example, “Birthday”, “Important”, etc. 
An CFArray of CFStrings.

The type is clearly specified as an array of strings, but for these apps the value is stored as a single string of comma-delimited keywords.

$ python
Python 3.10.5 (main, Jul 17 2022, 07:22:36) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import CoreServices
>>> mditem = CoreServices.MDItemCreate(None, "/users/rhet/Dropbox/Code/osxmetadata/t.txt")
>>> value = CoreServices.MDItemCopyAttribute(mditem, "kMDItemKeywords")
>>> value
(
    foo,
    bar
)
>>> mditem = CoreServices.MDItemCreate(None, "/Applications/Yoink.app")
>>> value = CoreServices.MDItemCopyAttribute(mditem, "kMDItemKeywords")
>>> value
'drag,drop,move,copy,alias,file,shelf,utility,shelves,documents'
>>>
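
One plausible explanation for the character-by-character output, offered as an assumption rather than a reading of the osxmetadata source: if the display code joins the elements of the value with ", " and the value is a plain string rather than a list, Python iterates over the string's characters. The helper name format_value below is hypothetical.

```python
def format_value(value):
    """Join the elements of a multi-value attribute for display.

    Hypothetical helper: it assumes the value is a list of strings and
    does not check the type, which is the suspected failure mode.
    """
    return ", ".join(value)


# A real list of keywords displays as expected.
as_list = format_value(["foo", "bar"])

# A single comma-delimited string is iterated character by character,
# so every character (including each comma) becomes its own element.
as_string = format_value("drag,drop")

print(as_list)
print(as_string)
```

The second line printed has the same shape as the Yoink output above: each letter is separated by ", " and each original comma shows up as ",,".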

And although Finder appears to show the keywords, it is actually showing a single string for the Keywords: value, not individual keywords.

Note that in this example (set by osxmetadata following the Apple documentation) there are two keywords, separated by a comma and a space:

Screen Shot 2022-11-15 at 8 51 45 PM

And in this example of an app with the single string value, there is no space after the comma:

Screen Shot 2022-11-15 at 8 52 15 PM

However, when searching in Spotlight, both forms are correctly identified. For example, if I search for keyword:foo, Spotlight finds my t.txt example. If I search for keyword:shelves (one of the kMDItemKeywords on Yoink.app), Finder correctly finds the app.

So, it must be that Spotlight works with kMDItemKeywords stored either as an array of strings or as a single string of comma-delimited values. But this is not what the documentation states.

So what should osxmetadata do? I think the best approach is to continue to use an array of strings when setting keywords, as the documentation states, but to treat any value that is a single string rather than a CFArray of CFStrings as a comma-delimited list and handle it correctly when retrieving it. I'm open to suggestions though.
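
A minimal sketch of the read-side handling proposed here, assuming the value has already been fetched (for example with MDItemCopyAttribute, as in the session above). The function name normalize_keywords is hypothetical and this is not the code that osxmetadata ships.

```python
def normalize_keywords(value):
    """Return kMDItemKeywords as a list of strings.

    Hypothetical helper. `value` stands in for whatever was read from the
    file: None, a sequence of strings, or a single comma-delimited string.
    """
    if value is None:
        # Attribute not set on this file.
        return []
    if isinstance(value, str):
        # Non-conforming form: one string holding comma-delimited keywords.
        # Entries are split as-is; surrounding whitespace is not trimmed.
        return value.split(",")
    # Conforming form: already a sequence of strings.
    return [str(item) for item in value]


documented = normalize_keywords(["foo", "bar"])
single_string = normalize_keywords(
    "drag,drop,move,copy,alias,file,shelf,utility,shelves,documents"
)

print(documented)
print(single_string)
```

Writing is left unchanged in this sketch: keywords would still be set as an array of strings, and only reading tolerates the single-string form.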

RhetTbull commented 1 year ago

@all-contributors please add @luckman212 for bug

allcontributors[bot] commented 1 year ago

@RhetTbull

I've put up a pull request to add @luckman212! :tada:

RhetTbull commented 1 year ago

The mystery deepens. osxmetadata uses a private/undocumented Apple API (MDItemSetAttribute) to set metadata. This method returns False (unsuccessful) when attempting to set kMDItemKeywords to a comma delimited list.

luckman212 commented 1 year ago

I think the best approach is to continue to use an array of strings when setting keywords as the documentation states but to treat any value which is a single string and not a CFArray of CFStrings as a comma-delimited list and handle these correctly when retrieving them.

Agree with this. It's definitely odd and not very reassuring, but it does work.

RhetTbull commented 1 year ago

Fixed in v1.2.1

luckman212 commented 1 year ago

Tested and working well! This is great 🚀

luckman212 commented 1 year ago

Last comment:

https://github.com/RhetTbull/osxmetadata/blob/719be3bfad71386fe64079817de7ae1aa37aeda9/osxmetadata/mditem.py#L155

It might be nice to automatically trim the strings returned from this method, so we don't end up with keywords like "__Photos" (_ are spaces)

This could be done with a simple list comprehension

[x.strip() for x in str(value).split(",")]
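
To illustrate the difference, here is a small sketch using a shortened sample in the style of the iMazing value shown earlier (a comma followed by a space between keywords). The sample string is an abbreviation chosen for illustration, not the full stored value.

```python
# Shortened sample: keywords separated by a comma and a space.
raw = "iOS, iPhone, iPad, Back up"

# Splitting on the comma alone leaves a leading space on every
# keyword after the first.
untrimmed = raw.split(",")

# Stripping each piece removes the surrounding whitespace but keeps
# spaces inside a keyword such as "Back up".
trimmed = [x.strip() for x in raw.split(",")]

print(untrimmed)
print(trimmed)
```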

RhetTbull commented 1 year ago

Thanks, good suggestion. Fixed in v1.2.2

RhetTbull commented 1 year ago

@all-contributors add @luckman212 for code

allcontributors[bot] commented 1 year ago

@RhetTbull

I've put up a pull request to add @luckman212! :tada: