oniony / TMSU

TMSU lets you tag your files and then access them through a nifty virtual filesystem from any other application.

Tag grouping AKA taggable objects #22

Open 0ion9 opened 9 years ago

0ion9 commented 9 years ago

To be quite clear, this issue has nothing to do with the upcoming release of 0.5.0. I was thinking about implementing this myself as an extension, and then started thinking about whether it might belong in core TMSU.

The basic idea is that many kinds of files which may be tagged can usefully be said to have 'groups' of tags.

For example, suppose you have a photo collection that includes, among other things, photos of Alice and Bob, who are often seen together. Alice doesn't wear hats as a rule, but Bob does. Being able to group tags, like {alice hat}, and to run group queries such as '{alice hat}' lets you find the photos where Alice herself is wearing a hat. An ungrouped query for 'alice hat' would instead return those few pictures hidden among hundreds in which Alice and Bob both appear and the hat is Bob's usual one. The logical grouping {alice hat} is more useful in this case than the undifferentiated 'alice bob hat'.

Hopefully, the mechanism described above is quite simple and obvious: each file has an arbitrary number of groups, identified only by ID number, and each group has an arbitrary number of tags. One might propose that 0 is a valid group id, which can be logically thought of as 'anything else' (that is, anything not otherwise in a logical grouping). For example, the background in photos is usually not going to be worth grouping tags in; it might reasonably be seen as an undifferentiated set of tags like 'beach sand palm_tree tree water sea plane'.

One complication that may not immediately be obvious with the above is that you may want a tagging to participate in multiple groups -- certainly, there is no guarantee that only one person in a photo is going to wear a hat.

The obvious implementation, AFAICS, would expand the file_tag table to have four fields and use all four as a compound PRIMARY KEY, similar to the current implementation with tag values.
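A minimal sketch of that four-column layout, using Python's sqlite3 module. It assumes the existing file_tag table is keyed on (file_id, tag_id, value_id); the group_id column, the default of 0 for "ungrouped" and all the ids below are illustrative assumptions, not TMSU's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE file_tag (
        file_id  INTEGER NOT NULL,
        tag_id   INTEGER NOT NULL,
        value_id INTEGER NOT NULL DEFAULT 0,
        group_id INTEGER NOT NULL DEFAULT 0,  -- 0 = 'anything else' (ungrouped)
        PRIMARY KEY (file_id, tag_id, value_id, group_id)
    )
""")

# Hypothetical ids: file 1 is a photo; tag 1 = alice, 2 = bob, 3 = hat, 4 = beach.
rows = [
    (1, 1, 0, 1),  # alice in group 1
    (1, 3, 0, 1),  # hat in group 1 (Alice's hat)
    (1, 2, 0, 2),  # bob in group 2
    (1, 3, 0, 2),  # hat in group 2 (Bob's hat): same tagging, second group
    (1, 4, 0, 0),  # beach, ungrouped background
]
conn.executemany("INSERT INTO file_tag VALUES (?, ?, ?, ?)", rows)

# Because group_id is part of the key, 'hat' can sit in both groups of one file.
hat_groups = [r[0] for r in conn.execute(
    "SELECT group_id FROM file_tag WHERE file_id = 1 AND tag_id = 3 ORDER BY group_id")]
print(hat_groups)
```

Including group_id in the compound key is what lets one tagging participate in several groups of the same file.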

The non-obvious implementation might separate tag groups into a separate (group_id, tag_id, value_id) table, with each unique set of tag=values occurring exactly once, and then have file_tag be keyed only by (file_id, group_id). This is rather more radical; I mention it mainly because I think it has better performance characteristics for querying.

The most complicated part would be querying; specifically, querying at an acceptable speed. The mentioned table refactoring is the only way I have seen so far to minimize use of recursive SQL queries -- it would allow finding the groups which match the query (recursively) in the most likely much smaller (group_id, tag_id, value_id) table, and then non-recursively finding the (file_id, group_id) tuples which reference one of those group_ids. However, this would make normal ungrouped queries more complex to perform. (It also probably wouldn't be that fast for cases where many groups match the query)
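A rough sketch of the refactored layout and the two-step lookup, again with sqlite3. The table name tag_group and all ids are hypothetical, and a simple count-based filter stands in for whatever group-matching logic (recursive or otherwise) a real implementation would need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag_group (              -- hypothetical table name
        group_id INTEGER NOT NULL,
        tag_id   INTEGER NOT NULL,
        value_id INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (group_id, tag_id, value_id)
    );
    CREATE TABLE file_tag (
        file_id  INTEGER NOT NULL,
        group_id INTEGER NOT NULL,
        PRIMARY KEY (file_id, group_id)
    );
    -- tag 1 = alice, 2 = bob, 3 = hat (hypothetical ids)
    INSERT INTO tag_group (group_id, tag_id) VALUES
        (1, 1), (1, 3),   -- group 1: {alice hat}
        (2, 2), (2, 3),   -- group 2: {bob hat}
        (3, 1);           -- group 3: {alice}
    INSERT INTO file_tag VALUES
        (10, 1),          -- file 10: Alice wearing a hat
        (11, 3), (11, 2), -- file 11: Alice, plus Bob wearing a hat
        (12, 2);          -- file 12: Bob wearing a hat (group 2 is shared)
""")

# Step 1 (inner query): find groups holding both alice and hat in the small table.
# Step 2 (outer query): find the files that reference one of those groups.
query = """
    SELECT DISTINCT file_id FROM file_tag
    WHERE group_id IN (
        SELECT group_id FROM tag_group
        WHERE tag_id IN (1, 3)
        GROUP BY group_id
        HAVING count(DISTINCT tag_id) = 2
    )
    ORDER BY file_id
"""
files = [r[0] for r in conn.execute(query)]
print(files)
```

File 11 contains both alice and hat, but in different groups, so only file 10 is returned.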

There are significant technical challenges involved; however, I felt I should propose the idea so that it can at least be considered.

oniony commented 9 years ago

Hi there,

Thanks for the suggestion. Having the ability to identify objects (including people) or subjects within a file and apply tags to these is something I have considered before. I'll add it to the medium-term TODO list, but I haven't put much thought into its implementation.

I would think that object classification would be preferable to a more abstract tag grouping, but I could be wrong. Maybe others have views on this.

Thanks, Paul

0ion9 commented 9 years ago

Hi, as I began implementing annotations seriously, I did find that I wanted to tag regions (image regions, pages, audio/video regions) of files more so than files themselves; I took the general model of attaching an annotation to a (x,y,z,w,h,d) tuple that represented an 'area of content' within the file.

I have begun looking into what is needed to effectively select regions from a file; I'd suggest that at minimum it needs some support from the viewer or editor. I've implemented a ROI selector for the sxiv image viewer, but it is currently buggy (initial selection of the ROI is unreliable/weird); anyway, it represents what I think is the minimum: you need to be able to select a region interactively and package it up easily into an x,y,z,w,h,d tuple. I think at minimum people would want to do this with image, video, and sound; I've only looked into the first so far.

Hopefully the above is on-target with what you meant about object identity/classification. I am not yet satisfied that object classification is a do-all, unless, I suppose, you include the possibility to simply annotate the whole file.

oniony commented 9 years ago

Yes, certainly, some way of identifying an object within the media, be it a person in an image or artist in a song, scene in a video, &c.

It will need some careful thought, as otherwise the facility could end up quite academic: something that works but is not utilitarian, and so would remain largely unused.

0ion9 commented 9 years ago

I've done two iterations on this idea now. I'm reasonably happy with my current UI ideas, though I'm pretty sure it can still be improved.

Basic concept: use -1, -2, -3, ... to identify groups. Also reserve the tag names '[' and ']'. This allows groups in the output of tmsu tags to be easily parsed by the human eye. Example: some_ungrouped_tag another_ungrouped_tag [ foo bar baz ] [ bar bim ], or the slightly more verbose and explicit some_ungrouped_tag another_ungrouped_tag -1 [ foo bar baz ] -2 [ bar bim ]. If the user wishes to attach other data to the group, like ROIs (cf. the tmsoup.annotation module), they can do so by referencing (file_id, group_id) tuples in their own table.
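To make the notation concrete, here is a small sketch of a parser for such a tag list. The function name parse_tags is hypothetical, and numbering unlabelled groups by output order is an assumption; nothing here is part of TMSU.

```python
def parse_tags(text):
    """Split a tag list into (ungrouped tags, {group id: [tags]})."""
    ungrouped, groups = [], {}
    current = None      # id of the group being filled, or None outside brackets
    pending_id = None   # explicit id taken from a preceding -N token
    for token in text.split():
        if token == "[":
            if current is not None:
                raise ValueError("nested groups are not supported")
            # Use the explicit -N id if one was given, else the next free number.
            current = pending_id if pending_id is not None else max(groups, default=0) + 1
            pending_id = None
            groups.setdefault(current, [])
        elif token == "]":
            if current is None:
                raise ValueError("unmatched ]")
            current = None
        elif current is None and token.startswith("-") and token[1:].isdigit():
            pending_id = int(token[1:])
        elif current is not None:
            groups[current].append(token)
        else:
            ungrouped.append(token)
    if current is not None:
        raise ValueError("unclosed [")
    return ungrouped, groups


implicit = parse_tags("some_ungrouped_tag another_ungrouped_tag [ foo bar baz ] [ bar bim ]")
explicit = parse_tags("some_ungrouped_tag another_ungrouped_tag -1 [ foo bar baz ] -2 [ bar bim ]")
print(implicit)
```

Under these assumptions the implicit and explicit forms parse to the same structure.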

Shell interaction: No known problems with shell interaction -- all tested shells (dash, bash, zsh) treat a bare [ or ] as a literal, not a pattern.

UI Specifics:

Issues:

I have done some preliminary testing, 'faking' groups by using tag values, like FILE1 : a=1 b=1 c=1 a=2 c=2 d=2 d=3 f=3 (personally, using tag values would actually be suitable, for me, since any key=value pairs I want to attach on a file are generally unrelated to object identity, like year=2014 or score=3)

The query I used was this:

SELECT file_id FROM (SELECT * FROM file_tag WHERE tag_id IN (1,2)) GROUP BY file_id,value_id HAVING count(*) == 2;

This finds the (file_id, "group") pairs that contain both tag 1 and tag 2, returning one file_id row per matching pair. I've tested with a range of values and this seems reliable. I am aware, though, that it is not obviously generalizable to expressions that cannot be determined by count, like a AND ((b OR c) OR (d OR e)) (the presence of the 'required item' a cannot be verified via count), or a NOT c NOT d (the disallowed items c and d cannot be verified via count). Then again, I'm really not that experienced with the use of HAVING or the dreaded WITH ;)
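The test can be reproduced with sqlite3 roughly as follows. The tag ids, the second file and the helper names are made up for illustration, and the value is stored directly as value_id so that it doubles as the "group" number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE file_tag (
        file_id INTEGER NOT NULL, tag_id INTEGER NOT NULL, value_id INTEGER NOT NULL,
        PRIMARY KEY (file_id, tag_id, value_id)
    )
""")
TAG = {"a": 1, "b": 2, "c": 3, "d": 4, "f": 5}  # hypothetical tag ids


def tag_file(file_id, spec):
    """Insert 'name=group' items, using the value as a fake group id."""
    for item in spec.split():
        name, value = item.split("=")
        conn.execute("INSERT INTO file_tag VALUES (?, ?, ?)",
                     (file_id, TAG[name], int(value)))


tag_file(1, "a=1 b=1 c=1 a=2 c=2 d=2 d=3 f=3")
tag_file(2, "a=1 b=2")  # a and b present, but in different "groups"


def files_with_group(*names):
    """Run the count-based query for a group containing all the given tags."""
    ids = [TAG[n] for n in names]
    marks = ",".join("?" * len(ids))
    sql = (f"SELECT file_id FROM (SELECT * FROM file_tag WHERE tag_id IN ({marks})) "
           f"GROUP BY file_id, value_id HAVING count(*) == ?")
    return [r[0] for r in conn.execute(sql, ids + [len(ids)])]


print(files_with_group("a", "b"))
print(files_with_group("a", "c"))
```

Note that a file is returned once per matching group (file 1 has two groups containing both a and c), so a real query would probably want SELECT DISTINCT.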

I'm still experimenting to figure out how to express the full range of TMSU's querying syntax in relation to groups. That said, I believe that by far the most common queries will be of the types [ a b c ] and [ a b NOT c NOT d ]; the former case is already understood (per explanation above), I'm working on the latter case.
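One possible way to handle the NOT case, offered only as a sketch and not as the approach settled on in this thread, is conditional aggregation: drop the WHERE filter and test each tag separately in HAVING. It relies on SQLite evaluating a comparison to 0 or 1 so that SUM counts matches; the helper name and ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_tag (file_id INTEGER, tag_id INTEGER, value_id INTEGER, "
             "PRIMARY KEY (file_id, tag_id, value_id))")
TAG = {"a": 1, "b": 2, "c": 3, "d": 4, "f": 5}  # hypothetical tag ids

# Same fake-group data as the test above: value_id doubles as the group number.
for item in "a=1 b=1 c=1 a=2 c=2 d=2 d=3 f=3".split():
    name, group = item.split("=")
    conn.execute("INSERT INTO file_tag VALUES (1, ?, ?)", (TAG[name], int(group)))


def match_groups(required, forbidden=()):
    """Return (file_id, group) pairs holding every required tag and no forbidden tag."""
    conditions, params = [], []
    for name in required:
        conditions.append("SUM(tag_id = ?) > 0")   # tag occurs at least once in the group
        params.append(TAG[name])
    for name in forbidden:
        conditions.append("SUM(tag_id = ?) = 0")   # tag never occurs in the group
        params.append(TAG[name])
    if not conditions:
        raise ValueError("need at least one required or forbidden tag")
    sql = ("SELECT file_id, value_id FROM file_tag GROUP BY file_id, value_id HAVING "
           + " AND ".join(conditions) + " ORDER BY file_id, value_id")
    return conn.execute(sql, params).fetchall()


print(match_groups(["a", "b"]))              # [ a b ]
print(match_groups(["a", "b"], ["c", "d"]))  # [ a b NOT c NOT d ]
print(match_groups(["a"], ["b"]))            # [ a NOT b ]
```

The trade-off is that every row has to be grouped before filtering, so this is likely to be slower than the count-based query on a large table.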

I look forward to your comments (though I don't expect them soon, I'm well aware this is a large slab of text :)

oniony commented 9 years ago

On 4 February 2015 at 04:53, 0ion9 wrote:

> Basic concept: use -1, -2 -3.. to identify groups.

I assume we are still talking about the idea of tagging components of a piece of media, for example objects or people in a scene?

The term 'group' is a bit confusing because you could conceptually have a group of tags that could be applied to files (though the 'imply' functionality would make such a feature somewhat pointless). So let's call the items being tagged objects (they may be people, or instruments or scenes).

> some_ungrouped_tag another_ungrouped_tag [ foo bar baz ] [ bar bim ], or the slightly more verbose and explicit some_ungrouped_tag another_ungrouped_tag -1 [ foo bar baz ] -2 [ bar bim ]

So are you saying that if we applied this, we would tag our photo, video, song (or whatever) with the regular tags 'some_ungrouped_tag' and 'another_ungrouped_tag' but then tag object 1 with the set of tags 'foo', 'bar' and 'baz', and object 2 with 'bar' and 'bim'? How do we know what objects 1 and 2 refer to, or do we rely upon the tags within the group to do this, e.g. having a 'name' tag inside the group for people? My question is, are you expecting a tagged object to be self-explanatory? I am wondering whether it needs to be formally identified or not.

> - -1, -2, -3 etc. are permitted. If they refer to an existing tag group id for that file, the tags are added to that group, e.g. -1 [ a b c ] adds the tags [ a b c ] to group 1. If the identified group does not exist, the tags are added to a new group.

I think your examples are intuitive in the case of tagging objects (and implicitly defining the objects), but the above syntax for adding tags to an object is a bit clunky. There would have to be some way of listing the objects in a file in order to work out which object the user is dealing with. Again, this comes back to object identification. It's almost as if you would need some kind of 'where' syntax to apply tags to the object that meets some conditions:

tmsu tag path/to/file --object-where="name = bob" hat sunglasses

But that is also quite clunky.

> - identifying groups by query is permitted, as in -a,b,c [ d f ]. (exploiting the disallowability of - in tag names to the max here ;). IMO the semantic meaning is fairly clear -- find the group with these tags, add (or remove) these other tags to it. Again, if the group doesn't exist, a new group is created. Another example: "-(a or b) c d" [ g h ]

On second read, I think what you are suggesting here is the same as what I just meant by --object-where, albeit with a syntax that I think is quite unintuitive.

Another possibility to consider is partitioning of the file into objects by (ab)using the path:

tmsu tag /home/paul/scene.jpg country=uk landscape
tmsu tag /home/paul/scene.jpg#bob hat sunglasses
tmsu tag /home/paul/scene.jpg#sally smiling

(An alternative would be to use / instead of #, treating the file as just another element in the path to the object.)

This approach, I think, is more intuitive but obviously has the limitation that only the file or a single object can be tagged at a time. So, to address that, there would have to be an alternative syntax to allow the file and its objects to be tagged on the same command line:

tmsu tag /home/paul/scene.jpg country=uk landscape --object=bob hat sunglasses --object=sally smiling

This is similar to your earlier examples. The advantage, however, is that the otherwise clunky modification of an object's tags is simplified:

tmsu tag /home/paul/scene.jpg#bob happy
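As a rough illustration of how the two proposed forms could map onto one structure, here is a sketch of an argument splitter. Both the # suffix and --object= are only the hypothetical syntax floated in this comment, not existing tmsu options, and parse_tag_args is a made-up name.

```python
def parse_tag_args(args):
    """Split ['path[#object]', tag..., '--object=NAME', tag...] into (path, {object: [tags]}).

    The key None holds tags applied to the file itself.
    """
    target, *rest = args
    # Assumption: the last '#' separates the file path from the object name,
    # so a file name that itself contains '#' would be mis-parsed.
    path, sep, obj = target.rpartition("#")
    if not sep:  # no '#': the whole argument is the file path
        path, obj = target, None
    tags = {obj: []}
    current = obj
    for arg in rest:
        if arg.startswith("--object="):
            current = arg[len("--object="):]
            tags.setdefault(current, [])
        else:
            tags[current].append(arg)
    return path, tags


combined = parse_tag_args(["/home/paul/scene.jpg", "country=uk", "landscape",
                           "--object=bob", "hat", "sunglasses",
                           "--object=sally", "smiling"])
single = parse_tag_args(["/home/paul/scene.jpg#bob", "happy"])
print(combined)
print(single)
```

The ambiguity noted in the comment above (a '#' inside a real file name) is one argument for the alternative of treating the object as a further path element or a separate option.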

So, in conclusion, I think your idea is a good one. Object classification is something I thought about years ago but dismissed as making the problem too complicated. But, as you have presented it here, it doesn't seem so bad now. Certainly there are some wrinkles that need attention but it is appealing. I'll add it to the TODO list.

Thanks, Paul

oniony commented 9 years ago

Sorry, the quotes went a bit wrong in that last message. I'll edit the issue so it might be better to read it online than in your email.

0ion9 commented 9 years ago

Yes, in general I had considered that an object would usually be identifiable by its tagset; in cases where this is not true, AFAICS it would usually be desirable to address objects by id number, as this is the usual approach to object instancing.

Personally, I find the prospect of naming each object tedious. This may be because I am more focused on a GUI type of application (using dmenu or lighthouse as a wrapper around TMSU tagging) and so have the presumption that selecting a group is easy. OTOH, it seems that ZSH completion is a comparable tool to this (describe -1 with a list of its member tags beside it, -2 ditto...)

The -a,b,c syntax was a logical progression from -1 style type syntax that identifies an object by id, but when I think about it I agree it is not that intuitive to pick up.

> There would have to be some way of listing the objects in a file in order to work out which object the user is dealing with.

In the proposal I wrote, this is handled by tmsu tags (implicitly by output order, or explicitly with -1 etc.). With your proposed adjustments, it seems that you would have to include -obob, etc., in the output of tmsu tags. (I say this because I have in mind that the output of tmsu tags should consist entirely of valid arguments to tmsu tag/untag. I'm not sure if you agree about that, but I think it's a nice property to aim for.)

You write: tmsu tag /home/paul/scene.jpg country=uk landscape --object=bob hat sunglasses --object=sally smiling

Presumably abbreviatable to tmsu tag /home/paul/scene.jpg country=uk landscape -obob hat sunglasses -osally smiling? Or tmsu tag --tags 'country=uk landscape -obob hat sunglasses -osally smiling' /home/paul/scene.jpg? (I guess so; I just wanted to try it out in the different ways it could appear. It still seems reasonably intuitive whichever formulation you use. Sounds good.)

Hm, it just occurred to me that we probably have different preconceptions about the function of this. For example, I proposed numeric ids because transferability between pictures (i.e. "true" object identity) is not significant to me (and so I was happy to optimize for storage efficiency); but after thinking about what you have written, I suspect that transferability between pictures is significant to you: you would want to be able to ask for any files containing a group named bob, perhaps, with the assumption that that is all the same Bob, not a variety of Bobs, and certainly not some person Bobs and some ... hair bobs ;)