novoid / filetags

Management of simple tags within file names
GNU General Public License v3.0


[[file:bin/screencast.gif]]

... filetags example demonstrating: controlled vocabulary file =~/.filetags=, tagging multiple files at once, removing tags by prepending a minus character, tagging using the proposed number shortcuts, tab completion of tags via =Tab=, and mutually exclusive tags (switching from =draft= to =final= without removing =draft=).

This Python script adds or removes tags to file names in the following form:

The script accepts an arbitrary number of files (see your shell for possible length limitations).

[[https://imgs.xkcd.com/comics/trained_a_neural_net.png]] (Source: [[https://xkcd.com/2173/][xkcd]])

** Why

Besides the fact that I am using [[https://en.wikipedia.org/wiki/Iso_date][ISO dates and times]] in file names (as shown in examples above), I am using tags with file names. To separate tags from the file name, I am using the separator "space dash dash space".

For people familiar with [[https://en.wikipedia.org/wiki/Regex][Regular Expressions]]:

: (<ISO date/time stamp>)?(?)?( -- )?.
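To make the naming convention concrete, here is a minimal Python sketch that splits a file name into its base name, its tags, and its extension. It is not the code filetags itself uses: the function name =split_tagged_name= is hypothetical, and splitting at the last " -- " is an assumption.

```python
import os

TAG_SEPARATOR = " -- "  # "space dash dash space", as described above


def split_tagged_name(filename):
    """Split a file name into (base name, list of tags, extension).

    Hypothetical helper for illustration only. Note that os.path.splitext
    treats everything after the last dot as the extension, so a name
    without an extension but with dots in a time-stamp is split wrongly.
    """
    stem, extension = os.path.splitext(filename)
    if TAG_SEPARATOR in stem:
        # Assumption: the tag list starts after the last separator.
        base, tag_string = stem.rsplit(TAG_SEPARATOR, 1)
        tags = tag_string.split()
    else:
        base, tags = stem, []
    return base, tags, extension


print(split_tagged_name("2013-05-16T15.31.42 Error message -- screenshot projectB.png"))
```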

Tagging files this way requires a file renaming process. Adding (or removing) tag(s) to a set of files results in multiple renaming processes. Despite advanced renaming tools like vidir (from [[http://joeyh.name/code/moreutils/][moreutils]]), it's handy to have a tool that makes adding and removing tags as simple as possible.

You may like to add this tool to your image or file manager of choice. I added mine to [[http://geeqie.sourceforge.net/][geeqie]] which is my favorite image viewer on GNU/Linux.

Here is [[https://glt18-programm.linuxtage.at/events/321.html][a 45 minute talk I gave]] at [[https://glt18.linuxtage.at/][Linuxtage Graz 2018]] presenting the idea of and workflows related to filetags and other handy tools for file management:

[[https://media.ccc.de/v/GLT18_-_321_-_en_-_g_ap147_004_-_201804281550_-_the_advantages_of_file_name_conventions_and_tagging_-_karl_voit/][bin/2018-05-06 filetags demo slide for video preview with video button -- screenshots.png]]

** Installation

This tool needs [[http://www.python.org/downloads/][Python 3 to be installed]].

You can install filetags via [[https://packaging.python.org/tutorials/installing-packages/][pip]], which is the recommended way, or from the source code, e.g., by cloning the [[https://github.com/novoid/filetags/][GitHub repository of filetags]].

*** Installation Via Pip

If you have installed Python 2 and Python 3 in parallel, make sure to use the correct pip version. You might need to use =pip3= instead of =pip=. If you only have Python 3 installed, you don't have to care ;-)

On Microsoft Windows (only), you need ~pip install pypiwin32~ as a prerequisite. For easy Windows File Explorer integration, take a look at [[https://github.com/novoid/integratethis][integratethis]].

Now install filetags via [[https://pip.pypa.io/en/stable/][pip]]: ~pip install filetags~

You get updates by executing the very same pip command again.

*** Installation Via Source Code

If you use the GitHub sources (and not pip):

** Usage

#+BEGIN_SRC sh :results output :wrap src
./filetags/__init__.py --help | sed 'sX/home/vkX\$HOMEX'
#+END_SRC

#+BEGIN_src
usage: ./filetags/__init__.py [-h] [-t "STRING WITH TAGS"] [--remove] [-i] [-R] [-s]
                              [--hardlinks] [-f] [--filebrowser PATH_TO_FILEBROWSER]
                              [--tagtrees]
                              [--tagtrees-handle-no-tag "treeroot" | "ignore" | "FOLDERNAME"]
                              [--tagtrees-link-missing-mutual-tagged-items]
                              [--tagtrees-dir ] [--tagtrees-depth TAGTREES_DEPTH]
                              [--ln] [--la] [--lu] [--tag-gardening] [-v] [-q]
                              [--version]
                              [FILE [FILE ...]]

This tool adds or removes simple tags to/from file names.

Tags within file names are placed between the actual file name and
the file extension, separated with " -- ". Multiple tags are
separated with " ":
  Update for the Boss -- projectA presentation.pptx
  2013-05-16T15.31.42 Error message -- screenshot projectB.png

This easy to use tag system has a drawback: for tagging a larger
set of files with the same tag, you have to rename each file
separately. With this tool, this only requires one step.

Example usages:
  filetags --tags="presentation projectA" *.pptx
      … adds the tags "presentation" and "projectA" to all PPTX-files
  filetags --tags="presentation -projectA" *.pptx
      … adds the tag "presentation" to and removes tag "projectA" from all PPTX-files
  filetags -i
      … ask for tag(s) and add them to all files in current folder
  filetags -r draft report*
      … removes the tag "draft" from all files containing the word "report"

This tools is looking for the optional first text file named ".filetags" in
current and parent directories. Each of its lines is interpreted as a tag
for tag completion. Multiple tags per line are considered mutual exclusive.

Verbose description: http://Karl-Voit.at/managing-digital-photographs/

positional arguments:
  FILE                  One or more files to tag

optional arguments:
  -h, --help            show this help message and exit
  -t "STRING WITH TAGS", --tags "STRING WITH TAGS"
                        One or more tags (in quotes, separated by spaces) to
                        add/remove
  --remove              Remove tags from (instead of adding to) file name(s)
  -i, --interactive     Interactive mode: ask for (a)dding or (r)emoving and
                        name of tag(s)
  -R, --recursive       Recursively go through the current directory and all
                        of its subdirectories. Implemented for --tag-gardening
                        and --tagtrees
  -s, --dryrun          Enable dryrun mode: just simulate what would happen,
                        do not modify files
  --hardlinks           Use hard links instead of symbolic links. This is
                        ignored on Windows systems. Note that renaming link
                        originals when tagging does not work with hardlinks.
  -f, --filter          Ask for list of tags and generate links in
                        "$HOME/.filetags_tagfilter" containing links to all
                        files with matching tags and start the filebrowser.
                        Target directory can be overridden by --tagtrees-dir.
  --filebrowser PATH_TO_FILEBROWSER
                        Use this option to override the tool to view/manage
                        files (for --filter; default: geeqie). Use "none" to
                        omit the default one.
  --tagtrees            This generates nested directories in
                        "$HOME/.filetags_tagfilter" for each combination of
                        tags up to a limit of 2. Target directory can be
                        overridden by --tagtrees-dir. Please note that this
                        may take long since it relates exponentially to the
                        number of tags involved. Can be combined with
                        --filter. See also http://Karl-Voit.at/tagstore/ and
                        http://Karl-Voit.at/tagstore/downloads/Voit2012b.pdf
  --tagtrees-handle-no-tag "treeroot" | "ignore" | "FOLDERNAME"
                        When tagtrees are created, this parameter defines how
                        to handle items that got no tag at all. The value
                        "treeroot" is the default behavior: items without a
                        tag are linked to the tagtrees root. The value
                        "ignore" will not link any non-tagged items at all.
                        Any other value is interpreted as a folder name within
                        the tagreees which is used to link all non-tagged
                        items to.
  --tagtrees-link-missing-mutual-tagged-items
                        When the controlled vocabulary holds mutual exclusive
                        tags (multiple tags in one line) this option generates
                        directories in the tagtrees root that hold links to
                        items that have no single tag from those mutual
                        exclusive sets. For example, when "draft final" is
                        defined in the vocabulary, all items without "draft"
                        and "final" are linked to the "no-draft-final"
                        directory.
  --tagtrees-dir        When tagtrees are created, this parameter overrides
                        the default target directory
                        "$HOME/.filetags_tagfilter" with a user-defined one.
                        It has to be an empty directory or a non-existing
                        directory which will be created. This also overrides
                        the default directory for --filter.
  --tagtrees-depth TAGTREES_DEPTH
                        When tagtrees are created, this parameter defines the
                        level of depth of the tagtree hierarchy. The default
                        value is 2. Please note that increasing the depth
                        increases the number of links exponentially.
                        Especially when running Windows (using lnk-files
                        instead of symbolic links) the performance is really
                        slow. Choose wisely.
  --ln, --list-tags-by-number
                        List all file-tags sorted by their number of use
  --la, --list-tags-by-alphabet
                        List all file-tags sorted by their name
  --lu, --list-tags-unknown-to-vocabulary
                        List all file-tags which are found in file names but
                        are not part of .filetags
  --tag-gardening       This is for getting an overview on tags that might
                        require to be renamed (typos, singular/plural, ...).
                        See also http://www.webology.org/2008/v5n3/a58.html
  -v, --verbose         Enable verbose mode
  -q, --quiet           Enable quiet mode
  --version             Display version and exit

:copyright: (c) by Karl Voit tools@Karl-Voit.at
:license: GPL v3 or any later version
:URL: https://github.com/novoid/filetags
:bugreports: via github or tools@Karl-Voit.at
:version: 2018-08-02
#+END_src

*** Examples:

: filetags --tags foo a_file_name.txt
... adds tag "foo" such that it results in ~a_file_name -- foo.txt~

: filetags -i *.jpeg
... interactive mode: asking for list of tags (for the JPEG files) from the user

: filetags --tags "foo bar" "file name 1.jpg" "file name 2 -- foo.txt" "file name 3 -- bar.csv"
... adds the tags "foo" and "bar" such that it results in ...
: "file name 1 -- foo bar.jpg"
: "file name 2 -- foo bar.txt"
: "file name 3 -- bar foo.csv"

: filetags --remove --tags foo "foo a_file_name -- foo.txt"
... removes tag "foo" such that it results in ~foo a_file_name.txt~

: filetags --tag-gardening
... prints out a summary of the tags used in the current folder and its sub-folders, and of tags that are most likely typos or abandoned

For =--filter= and =--tagtrees= examples see sections below.

Independent of tags you might define on the fly, the optional file =.filetags= stores a controlled vocabulary of recurrent tags; adjust its content to your needs. In an interactive session, this set is available (press the Tab key) for tagging any file in the folder where =.filetags= resides, and it propagates into folders lower in the hierarchy.

** Changelog

** Get the most out of filetags: controlled vocabulary ~.filetags~
:PROPERTIES:
:ID: 2018-07-08-cv
:CREATED: [2015-01-02 Fri 17:12]
:END:

This awesome tool is providing support for [[https://en.wikipedia.org/wiki/Controlled_vocabulary][controlled vocabularies]]. When invoked for interactive tagging, it is looking for files named ~.filetags~ in the current working directory and its parent directories as well. The first file of this name found is read in. Each line represents one tag. Those tags are used for tag completion.

This is purely great: with tags within ~.filetags~ you don't have to enter the tags entirely: just type the first characters and press =TAB= (twice to show you all possibilities). You will be amazed how efficiently you are going to tag things! :-)

Of course, you can remove existing tags by prepending a =-= character to the tag: =-tagname=. This also works interactively using the tab completion feature.

You can use comments in =.filetags= files: everything after a =#= character is considered a comment. You can even add a comment after a tag like "=mytag # this is a test tag=".

If you use tags that you do not want to be proposed during tagging, list them in lines like the following ones to omit their proposal (case insensitive):

: #donotsuggest omit-this-tag dontshow
: #donotsuggest wontpropose
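The rules for =.filetags= files described in this section can be summarized in a short Python sketch. It is not taken from the filetags source code: the function name =parse_vocabulary= is hypothetical, and treating both the =#donotsuggest= keyword and the listed tags as case insensitive is an assumption.

```python
def parse_vocabulary(text):
    """Parse the content of a .filetags file (illustrative sketch only).

    Returns (tags to suggest, mutually exclusive groups, hidden tags).
    """
    suggested, exclusive_groups, hidden = [], [], set()
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: the keyword is matched case-insensitively.
        if stripped.lower().startswith("#donotsuggest"):
            hidden.update(tag.lower() for tag in stripped.split()[1:])
            continue
        # Everything after a "#" character is a comment.
        tags = stripped.split("#", 1)[0].split()
        if len(tags) > 1:
            # Several tags in one line are mutually exclusive.
            exclusive_groups.append(tags)
        suggested.extend(tags)
    suggested = [tag for tag in suggested if tag.lower() not in hidden]
    return suggested, exclusive_groups, hidden


example = """\
mytag # this is a test tag
draft final
dontshow
#donotsuggest omit-this-tag dontshow
"""
suggested, groups, hidden = parse_vocabulary(example)
print(suggested)
print(groups)
```

In this example, =dontshow= is part of the vocabulary but is not proposed, and =draft final= forms one mutually exclusive group.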

** Mutually exclusive tags
:PROPERTIES:
:ID: 2018-07-08-mutually-exclusive-tags
:END:

If you enter multiple tags in the same line in ~.filetags~, they are interpreted as mutually exclusive tags. For example, if your ~.filetags~ contains the line ~winter spring summer autumn~, filetags replaces any season-tag with the new one. So if you tag the file …

: example file -- summer anothertag.txt

… with the tag ~winter~, it gets renamed to …

: example file -- winter anothertag.txt

… without having to manually remove the tag ~summer~.

Common mutually exclusive tags are =draft final= or =confidential internal public=.
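The replacement behavior can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name =apply_tag= is hypothetical, and placing the new tag at the position of the replaced one is inferred from the season example above, not from the filetags source code.

```python
def apply_tag(tags, new_tag, exclusive_groups):
    """Add new_tag to tags, replacing tags from the same exclusive group."""
    # Collect all tags that must not coexist with new_tag.
    rivals = set()
    for group in exclusive_groups:
        if new_tag in group:
            rivals.update(tag for tag in group if tag != new_tag)

    if new_tag in tags:
        # Already tagged: only drop the conflicting tags.
        return [tag for tag in tags if tag not in rivals]

    result, replaced = [], False
    for tag in tags:
        if tag in rivals:
            if not replaced:
                result.append(new_tag)  # new tag takes the old tag's place
                replaced = True
        else:
            result.append(tag)
    if not replaced:
        result.append(new_tag)  # no conflict: new tag becomes the last tag
    return result


seasons = [["winter", "spring", "summer", "autumn"]]
print(apply_tag(["summer", "anothertag"], "winter", seasons))
```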

** Filter
:PROPERTIES:
:CREATED: [2018-08-01 Wed 11:44]
:END:

Suppose you have a directory that contains hundreds of files.

If you want to retrieve a file whose tags you know, you can skim through all the files. However, filetags offers you a more elegant possibility: you can filter the files according to one or more tags.

For example, consider the following situation:

: $HOME/my party/
:   | 2018-06-25 Party invitation -- scan correspondence.pdf
:   | 2018-07-31 Guest list -- correspondence.txt
:   | 2018-08-01T11.51.44 Uncle Bob arrives.jpg
:   | 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
:   | 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
:   | ...
:   | 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
:   | 2018-08-05 Lessons learned for planning a party -- scan.pdf
:   | 2018-08-06 Thank-you letter Bob -- scan.pdf
:   | Bills/
:       | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:       | 2018-08-03 Bill of the butcher -- scan taxes.pdf

The following command and interaction would generate the following temporary link structure:

: filetags --filter

The user is asked to enter one or more tags and enters "scan". filetags then creates a directory containing links to all files that match the query. By default, the resulting directory is =.filetags_tagfilter= in your home directory. For our example, the content of this retrieval directory looks like this:

: $HOME/.filetags_tagfilter/
:   | 2018-06-25 Party invitation -- scan correspondence.pdf
:   | 2018-08-05 Lessons learned for planning a party -- scan.pdf
:   | 2018-08-06 Thank-you letter Bob -- scan.pdf

This way, our user is quickly able to skim through all scanned documents to locate the one desired to retrieve.

To locate all matching files in all sub-directories as well, the user is able to add the parameter =--recursive= ...

: filetags --filter --recursive

... and chooses to enter the tag "scan", which would generate the following temporary link structure:

: $HOME/.filetags_tagfilter/
:   | 2018-06-25 Party invitation -- scan correspondence.pdf
:   | 2018-08-05 Lessons learned for planning a party -- scan.pdf
:   | 2018-08-06 Thank-you letter Bob -- scan.pdf
:   | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:   | 2018-08-03 Bill of the butcher -- scan taxes.pdf
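The selection step behind =--filter= can be sketched in Python as follows. This is an illustration only: the function name =files_with_tags= is hypothetical, requiring a file to carry /all/ entered tags is an assumption, and the sketch only selects file names instead of creating links and starting a file browser.

```python
import os


def files_with_tags(filenames, wanted_tags):
    """Return the file names that carry every tag in wanted_tags."""
    wanted = set(wanted_tags)
    matches = []
    for name in filenames:
        stem = os.path.splitext(name)[0]
        # rpartition yields an empty separator if " -- " is not found.
        _, separator, tag_string = stem.rpartition(" -- ")
        tags = set(tag_string.split()) if separator else set()
        if wanted <= tags:
            matches.append(name)
    return matches


party = [
    "2018-06-25 Party invitation -- scan correspondence.pdf",
    "2018-07-31 Guest list -- correspondence.txt",
    "2018-08-01T11.51.44 Uncle Bob arrives.jpg",
    "2018-08-05 Lessons learned for planning a party -- scan.pdf",
    "2018-08-06 Thank-you letter Bob -- scan.pdf",
]
for name in files_with_tags(party, ["scan"]):
    print(name)
```

For the non-recursive example above, this selects the same three scanned documents that end up in the retrieval directory.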

** TagTrees
:PROPERTIES:
:ID: 2018-07-08-tagtrees
:END:

This function is somewhat sophisticated as it is not a widely known concept. If you're really interested in the whole story behind the visualization/navigation of tags using TagTrees, feel free to read [[http://Karl-Voit.at/tagstore/downloads/Voit2012b.pdf][my PhD thesis]] about it on [[http://Karl-Voit.at/tagstore/][the tagstore webpage]]. It is surely a piece of work I am proud of, and its general chapters are written so that the average person is perfectly able to follow them.

In short: this function takes the files of the current directory and generates hierarchies up to level of =$maxdepth= (by default 2, can be overridden via =--tagtrees-depth=) of all combinations of tags, [[https://en.wikipedia.org/wiki/Symbolic_link][linking]] all files according to their tags.

Too complicated? Then let's explain it with some examples.

Consider having a file like:

: My new car -- car hardware expensive.jpg

Once you generate the TagTrees, you'll find [[https://en.wikipedia.org/wiki/Symbolic_link][links]] to this file within sub-directories of =~/.filetags_tagfilter=, the default target directory: =car/= and =hardware/= and =expensive/= and =car/hardware/= and =car/expensive/= and =hardware/car/= and so on. You get the idea.

The default target directory can be overridden via =--tagtrees-dir=.

Therefore, within the folder =new/expensive/= you will find all files that have at least the tags "new" and "expensive" in any order. This is /really/ cool to have.
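To see which directories are involved, here is a small Python sketch that enumerates the TagTrees directories for the tags of a single file. It is not the filetags implementation: the function name =tagtree_directories= is hypothetical, and using ordered combinations (permutations) is inferred from the fact that both =car/hardware/= and =hardware/car/= appear above.

```python
from itertools import permutations


def tagtree_directories(tags, max_depth=2):
    """List the relative directories a file with these tags is linked into."""
    directories = []
    for depth in range(1, max_depth + 1):
        # Every ordered selection of `depth` distinct tags is one path.
        for combination in permutations(tags, depth):
            directories.append("/".join(combination) + "/")
    return directories


for directory in tagtree_directories(["car", "hardware", "expensive"]):
    print(directory)
```

With three tags and the default depth of 2, this yields three first-level and six second-level directories. The number grows quickly with more tags or a larger depth, which is why =--tagtrees-depth= should be chosen with care.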

Files of the current directory that don't have any tag at all are linked directly to the target directory =~/.filetags_tagfilter= so that you can find and tag them easily.

Personally, I use this feature within my image viewer of choice ([[http://geeqie.sourceforge.net/][geeqie]]). I mapped it to =Alt-T= because =Alt-t= is, of course, occupied by =filetags= for tagging. So when I am within my image viewer and I press =Alt-T=, TagTrees of the currently shown images are created. Then an additional image viewer window opens up for me, showing the resulting TagTrees. This way, I can quickly navigate through the tag combinations to interactively filter according to tags.

Please note: when you are tagging linked files within the TagTrees with filetags, only the current link gets updated with the new name. All other links to the renamed file within the other directories of the TagTrees are broken. You have to re-create the TagTrees to update all the links after tagging files.

The option =--tagtrees-handle-no-tag= controls how files with no tags should be handled. When set to =treeroot=, untagged files are linked in the TagTrees target directory directly. The option =ignore= does not link them at all. The option =FOLDERNAME= links them to a directory named accordingly to the value which is a sub-directory of the TagTrees target directory.

With the option =--tagtrees-link-missing-mutual-tagged-items= you can control whether or not there will be an additional TagTrees folder that contains all files that carry none of the mutually exclusive tags. Using the example ~winter spring summer autumn~ from above, all files that got none of those four tags get linked to a TagTrees directory named "no_winter_spring_summer_autumn". This way, you can easily find and tag files that don't participate in this set of mutually exclusive tags.

Using the example files from above:

: $HOME/my party/
:   | 2018-06-25 Party invitation -- scan correspondence.pdf
:   | 2018-07-31 Guest list -- correspondence.txt
:   | 2018-08-01T11.51.44 Uncle Bob arrives.jpg
:   | 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
:   | 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
:   | ...
:   | 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
:   | 2018-08-05 Lessons learned for planning a party -- scan.pdf
:   | 2018-08-06 Thank-you letter Bob -- scan.pdf
:   | Bills/
:       | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:       | 2018-08-03 Bill of the butcher -- scan taxes.pdf

... and the command line ...

: filetags --tagtrees --tagtrees-handle-no-tag "has_no_tag" --tagtrees-depth 2 --recursive

... filetags generates the temporary link structure:

: $HOME/.filetags_tagfilter/
:   | scan/
:       | 2018-06-25 Party invitation -- scan correspondence.pdf
:       | 2018-08-05 Lessons learned for planning a party -- scan.pdf
:       | 2018-08-06 Thank-you letter Bob -- scan.pdf
:       | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:       | 2018-08-03 Bill of the butcher -- scan taxes.pdf
:       | correspondence/
:           | 2018-06-25 Party invitation -- scan correspondence.pdf
:       | taxes/
:           | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:           | 2018-08-03 Bill of the butcher -- scan taxes.pdf
:   | correspondence/
:       | 2018-06-25 Party invitation -- scan correspondence.pdf
:       | 2018-07-31 Guest list -- correspondence.txt
:       | scan/
:           | 2018-06-25 Party invitation -- scan correspondence.pdf
:   | friends/
:       | 2018-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
:   | fun/
:       | 2018-08-01T23.53.19 Even uncle Bob desides to go home -- fun.jpg
:   | taxes/
:       | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:       | 2018-08-03 Bill of the butcher -- scan taxes.pdf
:       | scan/
:           | 2018-07-30 Beverages by FreshYouUp -- scan taxes.pdf
:           | 2018-08-03 Bill of the butcher -- scan taxes.pdf
:   | has_no_tag/
:       | 2018-08-01T11.51.44 Uncle Bob arrives.jpg
:       | 2018-08-01T14.12.23 Start of BBQ with the big steak.jpg
:       | ...

This looks complicated because many of the generated links are not really needed by the user. The beauty of this solution is that the user can navigate to a file using a wide set of different paths (the TagTrees) and choose the one path that suits her current cognitive model.

For example, she might want to retrieve "the one document from the last party which she remembers of having scanned and which she used for the invitation correspondence". With this mind-set, she most likely retrieves the document via =$HOME/.filetags_tagfilter/scan/correspondence/= or =$HOME/.filetags_tagfilter/correspondence/scan/= (does not matter which).

The large number of other TagTrees can be ignored for this retrieval task.

Another retrieval task example would be "all photos that do have no tag in order to continue tagging the photos". In this example, the user visits =$HOME/.filetags_tagfilter/has_no_tag/=, fires her image viewer (which has filetags integrated already - see below) and continues with the tagging activity. Since filetags synchronizes the tags within TagTrees linked files and the original files, the original files get renamed accordingly.

** Bonus: Using tags to specify a sub-set of photographs
:PROPERTIES:
:ID: 2018-07-08-sel-photos
:END:

You know the problem: you got back from Paris and you cannot show 937 image files to your friends. It's just too much.

My solution: I use tags to define selections. For example, I tag the ultimate cool photographs with ~sel~ ("selection"), using ~filetags~ of course.

Within geeqie, which is my preferred image viewer, I redefined ~F~ to call filetags with its =--filter= parameter. Now I get asked to enter one or more tags to filter the current folder. For presenting only the files that were tagged with ~sel~, I enter ~sel~ and confirm with ~Enter~.

This creates a temporary folder with symbolic links to all photographs of the current folder that contain the tag ~sel~ and it starts a new (additional) instance of geeqie.

In short: after returning from a trip, I mark all "cool" photographs within geeqie, choose ~t~ and tag them with ~sel~ (described in previous section). For showing only ~sel~ images, I just press ~F~, enter ~sel~ and instead of 937 photographs, my friends just have to watch the best 50 or so. :-)

Watch [[https://media.ccc.de/v/GLT18_-_321_-_en_-_g_ap147_004_-_201804281550_-_the_advantages_of_file_name_conventions_and_tagging_-_karl_voit][this 45 minute talk]] on how I am using this (and other) features.

If your system has Python 3 installed, you can start using filetags right away in any command line environment.

However, users also want to integrate tools like filetags into various GUI tools.

The [[file:Integration.org][Integration.org file]] explains the integration into some tools that allow external commands to be added:

If you have integrated filetags in additional commonly used tools, please send me a short how-to so that others are able to get the most out of filetags as well.

This tool is part of a tool-set which I use to manage my digital files such as photographs. My work-flows are described in [[http://karl-voit.at/managing-digital-photographs/][this blog posting]] you might like to read and in the video which is linked above.

In short:

For tagging, please refer to [[https://github.com/novoid/filetags][filetags]] and its documentation.

See [[https://github.com/novoid/date2name][date2name]] for easily adding ISO time-stamps or date-stamps to files.

For easily naming and tagging files within file browsers that allow integration of external tools, see [[https://github.com/novoid/appendfilename][appendfilename]] (once more) and [[https://github.com/novoid/filetags][filetags]].

Moving to the archive folders is done using [[https://github.com/novoid/move2archive][move2archive]].

Having tagged photographs gives you many advantages. For example, I automatically [[https://github.com/novoid/set_desktop_background_according_to_season][choose my desktop background image according to the current season]].

Files containing an ISO time/date-stamp get indexed by the filename-module of [[https://github.com/novoid/Memacs][Memacs]].


I'm glad you like my tools. If you want to support me:

This section is an exhaustive list of features of =filetags=. You might skip it when you're a first-time user so that you don't get confused by features that simple use-cases don't need.

This section is particularly helpful for re-implementing =filetags= functionality and for power-users who are interested in the advanced functions provided by this tool.

** General

| Before                           | When               | After                            | Note                                       |
|----------------------------------+--------------------+----------------------------------+--------------------------------------------|
| =Some file name.jpeg=            | tagging with =foo= | =Some file name -- foo.jpeg=     | Tag separator is added automatically       |
| =Some file name=                 | tagging with =foo= | =Some file name -- foo=          | There is no need for a file extension      |
| =Some file name -- foo.jpeg=     | tagging with =bar= | =Some file name -- foo bar.jpeg= | =bar= becomes last tag                     |
| =Some file name.jpeg.lnk=        | tagging with =bar= | =Some file name -- bar.jpeg.lnk= | The =.lnk= extension is taken into account |
| =Some file name -- bar.jpeg=     | untagging =bar=    | =Some file name.jpeg=            | Tag separator is removed                   |
| =Some file name -- foo bar.jpeg= | untagging =foo=    | =Some file name -- bar.jpeg=     | Tag order stays same when removing         |
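For anyone re-implementing this behavior, the rows of the table can be reproduced with the following Python sketch. It is written for illustration and is not the filetags source code: the names =add_tag=, =remove_tag=, =_split= and =_join= are hypothetical, and handling =.lnk= as an extra suffix after the real extension is an assumption derived from the table.

```python
import os

SEPARATOR = " -- "


def _split(filename):
    """Return (base name, tags, extension), keeping a trailing .lnk suffix."""
    link_suffix = ""
    if filename.lower().endswith(".lnk"):
        filename, link_suffix = filename[:-4], filename[-4:]
    stem, extension = os.path.splitext(filename)
    base, found, tag_string = stem.rpartition(SEPARATOR)
    if not found:  # no separator: the whole stem is the base name
        base, tag_string = stem, ""
    return base, tag_string.split(), extension + link_suffix


def _join(base, tags, extension):
    """Assemble a file name; the separator only appears if there are tags."""
    return base + (SEPARATOR + " ".join(tags) if tags else "") + extension


def add_tag(filename, tag):
    base, tags, extension = _split(filename)
    if tag not in tags:
        tags.append(tag)  # a new tag becomes the last tag
    return _join(base, tags, extension)


def remove_tag(filename, tag):
    base, tags, extension = _split(filename)
    # Filtering keeps the order of the remaining tags.
    return _join(base, [t for t in tags if t != tag], extension)


print(add_tag("Some file name.jpeg.lnk", "bar"))
print(remove_tag("Some file name -- foo bar.jpeg", "foo"))
```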

** Interactive Mode

** Controlled Vocabulary (CV)

Please read [[id:2018-07-08-cv][this]] first in order to understand CVs.

** Filter

This function is very handy for filtering groups of photographs within a large set of photographs as described [[id:2018-07-08-sel-photos][here]].

** Features Related to TagTrees

[[id:2018-07-08-tagtrees][The TagTrees concept]] was developed by me during my PhD thesis ([[http://Karl-Voit.at/tagstore/downloads/Voit2012b.pdf][PDF]]) when developing with the [[http://Karl-Voit.at/tagstore/][tagstore research platform]].

Please note that in future, all functions related to TagTrees will be moved into a separate tool named =tagtrees=.

** Tag Gardening

Just invoke =filetags --tag-gardening= or =filetags --recursive --tag-gardening= and read its output to learn about helpful analysis results to curate your tags. My personal favorites are:

This feature is really powerful when it comes to maintaining your file tags or gaining insight into your tagging patterns.
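To give a rough idea of what such an analysis can look like, here is a Python sketch that counts tag usage and flags rarely used tags that closely resemble a more frequent one (likely typos or singular/plural variants). It is an assumption-laden illustration, not the algorithm filetags uses: the function name =gardening_report=, the similarity measure from =difflib= and the cutoff of 0.8 are all choices made for this example.

```python
from collections import Counter
from difflib import get_close_matches


def gardening_report(tag_lists, cutoff=0.8):
    """Return (usage counts, {suspicious tag: similar, more frequent tag})."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    suspects = {}
    for tag, count in counts.items():
        if count > 1:
            continue  # only tags used once are considered suspicious here
        more_frequent = [t for t in counts if t != tag and counts[t] > count]
        similar = get_close_matches(tag, more_frequent, n=1, cutoff=cutoff)
        if similar:
            suspects[tag] = similar[0]
    return counts, suspects


tag_lists = [
    ["scan", "correspondence"],
    ["correspondence"],
    ["scan", "taxes"],
    ["correspondance"],  # deliberate typo for this example
]
counts, suspects = gardening_report(tag_lists)
print(counts.most_common())
print(suspects)
```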