RhetTbull / osxmetadata

Python package to read and write various MacOS extended attribute metadata such as tags/keywords and Finder comments from files. Includes CLI tool for reading/writing metadata.
MIT License

osxmetadata does not have the same output as mdls #60

Closed jakewilliami closed 1 year ago

jakewilliami commented 2 years ago

E.g.

$ osxmetadata -l ~/.bashrc
/Users/jakewilliami/.bashrc:

$ mdls ~/.bashrc
kMDItemFSContentChangeDate = 2022-07-11 03:30:37 +0000
kMDItemFSCreationDate      = 2022-07-11 03:21:16 +0000
kMDItemFSCreatorCode       = ""
kMDItemFSFinderFlags       = 0
kMDItemFSHasCustomIcon     = 0
kMDItemFSInvisible         = 1
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery      = 0
kMDItemFSLabel             = 0
kMDItemFSName              = ".bashrc"
kMDItemFSNodeCount         = 7116
kMDItemFSOwnerGroupID      = 20
kMDItemFSOwnerUserID       = 501
kMDItemFSSize              = 7116
kMDItemFSTypeCode          = ""

RhetTbull commented 2 years ago

Hi @jakewilliami, I want to make sure I understand the issue. osxmetadata is not a replacement for mdls. As stated in the docs, osxmetadata only supports specific metadata attributes that are set via extended attributes. What did you expect to see in your example output? In your example, the file doesn't appear to have any extended attributes, which is why osxmetadata doesn't show any output. The metadata that mdls shows in this case is derived from the filesystem, not from extended attributes.
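To illustrate the distinction: the kMDItemFS* values in the mdls output above look like what a plain `stat` call returns, with no extended attributes involved. The following is a minimal sketch under that assumption. The key-to-field mapping is an illustration only, not documented osxmetadata or Spotlight behavior, and `fs_derived_metadata` is a hypothetical helper name.

```python
import os
from datetime import datetime, timezone


def fs_derived_metadata(path):
    """Return filesystem values that resemble the kMDItemFS* keys from mdls.

    The mapping from stat fields to kMDItemFS* names is an assumption
    made for illustration; nothing here reads extended attributes.
    """
    path = os.path.expanduser(path)
    st = os.stat(path)
    return {
        "kMDItemFSName": os.path.basename(path),
        "kMDItemFSSize": st.st_size,
        "kMDItemFSOwnerUserID": st.st_uid,
        "kMDItemFSOwnerGroupID": st.st_gid,
        # Modification time, reported in UTC like the mdls output.
        "kMDItemFSContentChangeDate": datetime.fromtimestamp(
            st.st_mtime, tz=timezone.utc
        ),
    }
```

Calling `fs_derived_metadata("~/.bashrc")` would return values of this kind even for a file that has no extended attributes at all, which is consistent with osxmetadata printing nothing for the same file.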

jakewilliami commented 2 years ago

@RhetTbull, my bad then, I must have misread the documentation. I read

> Apple provides rich support for file metadata through various metadata extended attributes. MacOS provides tools to view and set these various metadata attributes. For example, mdls lists metadata associated with a file but doesn't let you edit the data while xattr allows the user to set extended attributes but requires the values be in the form of a MacOS plist which is impractical. osxmetadata makes it easy to to both view and manipulate the MacOS metadata attributes, either programmatically or through a command line tool.

(bold is mine), so I thought that meant that osxmetadata allows you to see mdls and xattr info.

In particular, I was hoping to use osxmetadata to view the metadata attributes of the file, such as those items beginning with kMDItem.

RhetTbull commented 2 years ago

Perhaps the docs should say "osxmetadata makes it easy to both view and manipulate some MacOS metadata..."

You can read/write many of the kMDItem attributes, just not all of them. I built osxmetadata to allow me to read/write the metadata attributes most useful for searching with Spotlight, like kMDItemSubject, kMDItemTitle, etc., and it does let you do many of those.

I have another project, autofile, that does read mdls data and many other attributes. It is currently built for a specific purpose (automatically filing files based on their metadata), but it includes a wrapper around mdls that might be helpful. I do eventually plan to pull the metadata engine out of autofile to make a stand-alone library for accessing all sorts of metadata.

RhetTbull commented 2 years ago

> In particular, I was hoping to use osxmetadata to view the metadata attributes of the file, such as those items beginning with kMDItem.

Which kMDItem* attributes did you want to use that osxmetadata doesn't provide?

jakewilliami commented 2 years ago

> Which kMDItem* attributes did you want to use that osxmetadata doesn't provide?

osxmetadata doesn't provide any of the kMDItemFS* attributes that I listed using mdls (see above output compared to mdls). The kMDItem I wanted was kMDItemContentTypeTree.

I was actually looking at your repository to get inspiration for how to get kMDItemContentTypeTree in another language (Julia). I ended up writing a solution myself using Julia's FFI.

> but [autofile] includes a wrapper around mdls

I specifically wanted to avoid calling out to mdls, in the interest of efficiency.

I'm not sure if you can use my code as a basis for your Python codebase, as I am unsure what Python's FFI is like, but it would be only a slightly different function call; where I use MDItemCopyAttribute to get a single attribute, you'll probably need MDItemCopyAttributeNames and MDItemCopyAttributes to collect all attributes.

EDIT: implementing the C calls to edit the attributes is a different problem that I have not dealt with.
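For the read side, the "collect all attributes" idea could look roughly like this from Python. This is a sketch, not osxmetadata's implementation: it assumes PyObjC's CoreServices bindings (the `pyobjc-framework-CoreServices` package) expose the C functions `MDItemCreate`, `MDItemCopyAttributeNames` and `MDItemCopyAttributes` under the same names, and `copy_all_mditem_attributes` is a hypothetical helper name.

```python
import sys


def copy_all_mditem_attributes(path):
    """Return every Spotlight attribute for `path` as a dict (macOS only).

    Assumption: PyObjC's CoreServices module wraps MDItemCreate,
    MDItemCopyAttributeNames and MDItemCopyAttributes as plain functions.
    """
    if sys.platform != "darwin":
        raise RuntimeError("MDItem functions are only available on macOS")

    # Imported lazily so this module can still be loaded on other platforms.
    import CoreServices

    item = CoreServices.MDItemCreate(None, path)
    if item is None:
        raise ValueError(f"could not create an MDItem for {path!r}")

    # First ask which attribute names exist, then fetch them all in one call.
    names = CoreServices.MDItemCopyAttributeNames(item)
    attributes = CoreServices.MDItemCopyAttributes(item, names)
    return dict(attributes) if attributes is not None else {}
```

Fetching the names first and then copying them in bulk avoids one `MDItemCopyAttribute` call per key, and it avoids spawning an mdls process.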

RhetTbull commented 2 years ago

> EDIT: implementing the C calls to edit the attributes is a different problem that I have not dealt with

Editing the attributes is my primary motivation/use case for osxmetadata. Specifically, those extended attributes that are useful with Spotlight. osxmetadata was built to work with the extended attributes so I could use Spotlight and Finder smart folders more effectively.

I'll take a look at your Julia implementation but I don't have a burning need for kMDItemContentTypeTree. Python can call Objective-C directly via a bridge so calling MDItemCopyAttributes should be easy. But again, what I really want to do is edit the extended attributes (which xattr allows but requires translation to/from plist or other binary formats that Apple uses).
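To make the plist translation concrete, here is a minimal sketch of encoding and decoding a value the way it would need to be stored in an extended attribute. It assumes the attribute lives under the `com.apple.metadata:` prefix and holds a binary plist. The helper names are hypothetical, and the actual reading or writing of the extended attribute (for example with the `xattr` command) is left out.

```python
import plistlib

# Assumed naming convention for Spotlight metadata kept in extended attributes.
XATTR_PREFIX = "com.apple.metadata:"


def encode_xattr_value(value):
    """Serialize a Python value to the binary plist bytes an xattr would hold."""
    return plistlib.dumps(value, fmt=plistlib.FMT_BINARY)


def decode_xattr_value(raw):
    """Parse plist bytes read from an xattr back into a Python value."""
    return plistlib.loads(raw)


attribute_name = XATTR_PREFIX + "kMDItemTitle"
raw = encode_xattr_value("Quarterly report")
print(attribute_name, raw[:8], decode_xattr_value(raw))
```

The bytes in `raw` are not something you would want to type by hand, which is the impracticality of using xattr directly.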

jakewilliami commented 2 years ago

If your primary use case is editing these attributes, then I think you shouldn't edit things like kMDItemContentTypeTree, as that is set by the system and I can't see a use case for setting it.

Cool that Python can call Objective-C directly. That would probably save you having to implement CFString*, CFArray*, and CFDictionary* methods yourself.

RhetTbull commented 1 year ago

@jakewilliami I've released version 1.0.0 of osxmetadata, which fixes this issue and several others. It's a complete rewrite that uses the native macOS calls to get/set metadata. It does change the API in breaking ways, though, so check out the README.md. It uses MDItemCopyAttribute to get metadata (and similar NSURL methods to get NSURL metadata) and an undocumented MDItemSetAttribute method to set the metadata. You can now get the kMDItemContentTypeTree like this:

>>> from osxmetadata import *
>>> md = OSXMetaData("test_file.txt")
>>> md.kMDItemContentTypeTree
['public.plain-text', 'public.text', 'public.data', 'public.item', 'public.content']

RhetTbull commented 1 year ago

@all-contributors add @jakewilliami for ideas

allcontributors[bot] commented 1 year ago

@RhetTbull

I've put up a pull request to add @jakewilliami! :tada:

jakewilliami commented 1 year ago

This looks great @RhetTbull! I’m on holiday at the minute but will return next week, upon which I will review the MR :-)