Dijji / FileMeta

Enable Explorer in Vista, Windows 7 and later to see, edit and search on tags and other metadata for any file type
Microsoft Public License

Metadata Name questions #61

Open jc508 opened 5 years ago

jc508 commented 5 years ago

Hi, I think this tool will do nicely for what we want, but I have a couple of questions, please. Background: we have somewhere between 35,000 and 60,000 files, mostly JPGs, to 'manage'. So far Windows Explorer has been used, but only for files and folders.
So the folder path has often been used to house extra metadata, with the result that there are thousands of duplicates. For example, one JPG has been filed under \Shared_Pictures\Euroa book ALL\1985 photos, \Shared_Pictures\Newspapers\1985 photos, \Shared_Pictures\Euroa book ALL\EUROA jul 2007\1985 photos and about 4 other places.

A bit of thinking about this and the type of metadata in common includes

So to the questions: Can we define new metadata items such as 'Place', or are we restricted to the pool of properties already in existence? Can we change the display name for an attribute? I already see that the item named System.Keywords is shown by Explorer as 'Tags'.

and/or is there any other advice anybody could give?

Thanks JC

Dijji commented 5 years ago

Hi JC

It is possible to define new properties. It involves creating a fairly complicated definition file, and then invoking a Windows API to register it. However, I've never seen a program to help you do it, or heard of anybody doing it.

There are a couple of reasons for this. Firstly, the set of predefined properties is actually pretty comprehensive. Secondly, it turns out that for searching purposes you don't really need to use specific properties: putting all the values you want in as tags (or Keywords) is typically fine. In general, metadata values don't clash that much, apart from dates, and there are already a lot of good date properties.

Search can be invoked just by typing into the search box in the top right of Explorer, and will pick up on metadata values (assuming you added your folder tree to the list of folders to be indexed). You can also narrow your search using Boolean operators like AND, or look for specific properties, for example using keywords:=whatever.

As far as I know, you cannot change the display names of already defined properties.

I have encountered similar problems with my own archives, and after various experiments I settled on a simple folder hierarchy based on date taken, with all the other information in tags.

One variation retained the folder structure to help me find things, but eliminated the duplication by using hard links. These are a mechanism in NTFS that allows a single physical file to appear in multiple places in the folder structure of a single hard drive. Search works better.
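As a rough illustration of the hard link idea (a sketch only, not the program described above), Python's standard `os.link` can make one file appear under a second folder. The helper name `link_into_folder` is invented for this example, and both folders are assumed to be on the same NTFS volume.

```python
import os
from pathlib import Path


def link_into_folder(original: Path, folder: Path) -> Path:
    """Make `original` also appear inside `folder` by creating a hard link.

    Hypothetical helper for illustration. Both paths must be on the same
    volume, because a hard link cannot span drives.
    """
    folder.mkdir(parents=True, exist_ok=True)
    link = folder / original.name
    # After this call both names refer to the same data on disk;
    # no second copy of the contents is made.
    os.link(original, link)
    return link
```

Because both names point at the same file, a change made through one name is visible through the other, and the data is only freed when the last name is deleted.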

Another approach rebuilt what looked like a folder structure based on the tag values, i.e. files appeared in a virtual folder if and only if they had the corresponding tag. This also had an idea of folder hierarchy, so that street folders appeared within town folders, et cetera. Again, it turned out that I didn't even use this myself, so I didn't publish it.

A program that did turn out to be more useful took my old folder structure and weeded out duplicates.
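The program itself is not shown here, but one common way to find duplicates is to group files by a hash of their contents. The following is a minimal sketch under that assumption; the function names `file_digest` and `find_duplicates` are made up for the example, and it only reports duplicates rather than deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group the files under `root` that have identical contents."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only the digests that are shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Note that two hard links to the same file would also be reported as a group, since their contents are necessarily identical.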

In your case, something that converted your old folder structure to a new one, preserving information by converting folder names to properties such as tags, might be very useful. I don't know how you are at programming, but a script in a language such as PowerShell should be able to handle this.
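To make the folder-names-to-tags idea concrete, here is a small sketch in Python rather than PowerShell. It only derives candidate tag values from a path; actually writing them into a property such as System.Keywords is a separate step that is not shown. The function name `tags_from_path` and the file name `IMG_0001.jpg` are hypothetical, while the folder names come from the example earlier in this thread.

```python
from pathlib import PureWindowsPath


def tags_from_path(file_path: str, root: str) -> list[str]:
    """Turn each folder name between `root` and the file into a tag."""
    rel = PureWindowsPath(file_path).relative_to(PureWindowsPath(root))
    tags = []
    for part in rel.parent.parts:
        tag = " ".join(part.split())  # collapse stray whitespace
        if tag and tag not in tags:   # skip empties and repeats
            tags.append(tag)
    return tags


if __name__ == "__main__":
    print(tags_from_path(
        r"\Shared_Pictures\Euroa book ALL\EUROA jul 2007\1985 photos\IMG_0001.jpg",
        r"\Shared_Pictures",
    ))
```

In practice the folder names would probably need cleaning up or mapping to a fixed vocabulary before being used as tags; this sketch passes them through unchanged.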

Whatever, yours is a problem close to my heart, so feel free to ask for further clarification on anything here that is obscure.

Dijji

jc508 commented 5 years ago

Dijji Thanks for the feedback; we must be 'attuned' as I think I have ended up intuiting and/or adopting all of your suggestions except the links.

Once I found an actual list of the properties that I could search I settled on the following existing attributes

There was also a bit of trial and error, since some attributes are editable in Explorer and some are not, and I haven't found a list of these. For example, System.ItemParticipants and System.Photo.PeopleNames do not seem to be editable, but System.Contact.LastName is.

I am aiming to have all the files uniquely named in ONE folder. This will necessitate a fair bit of process and habit change, as now they can't just dump a batch of IMG_* files in a folder where ALL the context is hidden in the folder path. For migration of the existing files, as you say, the plan is to programmatically generate the metadata from existing folder names into appropriate attributes as above.
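The naming scheme actually used is not described in this thread, so the following is only one possible sketch of how unique names could be planned before flattening a tree into a single folder. The function `plan_unique_names` and its numeric-suffix rule are assumptions made for illustration; names are compared case-insensitively because Windows treats `IMG_1.jpg` and `img_1.JPG` as the same name.

```python
from pathlib import PureWindowsPath


def plan_unique_names(paths: list[str]) -> dict[str, str]:
    """Map each source path to a file name that is unique in one flat folder."""
    used = set()
    plan = {}
    for path in paths:
        p = PureWindowsPath(path)
        candidate = p.name
        n = 1
        # Append _1, _2, ... until the name no longer collides.
        while candidate.lower() in used:
            candidate = f"{p.stem}_{n}{p.suffix}"
            n += 1
        used.add(candidate.lower())
        plan[path] = candidate
    return plan
```

This only avoids name collisions. It does not detect files with identical contents, so true duplicates would still need to be weeded out separately.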

It's ambitious, but I am pretty much there. It took 3 aborted strategies before I had one that looks like it will work. From the starting 81,221 files in 1,478 folders I am down to 57,050 uniquely named files. There are still 7,800 duplicates here, but they can wait until stage 2.

The tool set ended up using some bits that are generally beyond the home user

I have details of all the specific steps and things to think about if anybody else is deep in the same situation.

Thanks again JC