darktable-org / lua-scripts

155 stars 110 forks source link

Don't delete unused tags in path of attached ones #326

Open quovadit opened 3 years ago

quovadit commented 3 years ago

With PR 8270 you can convert nodes of hierarchical tags to new tags. As no images are attached to this new tag directly, the lua script delete_unused_tags would delete this tag again.

I would like to adapt the lua-script so that non attached tags in the path of an attached one are not deleted

Is it ok to edit the existing script?

I yes - I will open a Pull Request

phweyland commented 3 years ago

@wpferguson we may need you help here. This is an old issue. The tags on the path of a leaf tag (reference to the tree view) are part of the tags of an image attached to the leaf tag at export time. Example:

places|AT
places|AT|Wien

If Wien is attached to an image, then this image can inherit of 2 tags at export time: AT and Wien. Here is the issue with delete_unused_tags which has ... no pity for AT.

I agree with the need to have a mean to have a tag cleanup feature which does not delete the tags without attached images but with a child tag having an image attached.

What are your thoughts ? TIA.

phweyland commented 3 years ago

A bit of context to explain why we need the above.

Two years ago with the tag places|AT|Wien attached to an image we had the following choices for exporting:

In that context "delete_unused_tags" worked perfectly.

Now there are other features linked to the tag object: synonyms, privacy, category, images ordering.

In the tag places|AT|Wien, AT is a node in the hierarchical representation of the tag but is not a tag itself. It is a property of places|AT|Wien and has no explicit attribute except its name (node name). It has an implicit attribute, category (this is a design choice), and as such not exportable. This is perfectly convenient for places but not for AT in our example. To turn the name AT exportable the user needs to create a tag places|AT. All the tag names (AT) on the path of an attached tag (places|AT|Wien) are then exported together with this attached tag name (Wien), if "omit_tag_hierarchy" is not set. EDIT if "omit_tag_hierarchy" is not set all nodes are exported for backward compatibility.

Then, to conform with this tag exportation behavior, the tag cleaning should have the possibility/option to consider places|AT as used if places|AT|Wien is attached without being itself attached (which would be also a solution but a very boring one).

Does that clarifies the goal ?

wpferguson commented 3 years ago

If I understand correctly the goal is to export the tags AT and Wein, without also exporting places?

phweyland commented 3 years ago

If I understand correctly the goal is to export the tags AT and Wein, without also exporting places?

No, export tags already works like that.

The goal here is to have a clean up script which doesn't delete the tags which are on the path of the used tags.

wpferguson commented 3 years ago

What I don't understand, and don't seem to be smart enough to figure out, is why we are making AT a tag. What does making AT a tag provide us that we don't already have? If the only goal is to make the tag AT export in a certain case, then why don't we just add an export flag to the flags and check it for that certain case? Or does making AT a tag provide a capability that I'm not aware of?

phweyland commented 3 years ago

What does making AT a tag provide us that we don't already have? If the only goal is to make the tag AT export in a certain case, then why don't we just add an export flag to the flags and check it for that certain case? Or does making AT a tag provide a capability that I'm not aware of?

There is no way to add a property (exportable or not) to AT as long it is not a tag. So by default a "node only" is not exportable. Which is OK for places which doesn't bring any value to the image in that case, but NOK for AT which is a characteristic of the image.

wpferguson commented 3 years ago

If I export an image with the tag Sports|Basketball|Men with just tags and omit_hierarchy selected in the export box, then run exiftool against it I see the tags Sports, Basketball, Mens. In what case does AT not export?

I'm finally starting to see (after my third cup of coffee :smile:). In my flag it scenario, it only exists as part of something larger so you can't tag it by itself, thus you have to make it a tag. Or, do we flag it as something else, since it's not being used to tag something.

quovadit commented 3 years ago

@wpferguson : you are right, the goal of 8270 is to export AT and Wien without exporting places

And as @phweyland explained in detail, the goal of this issue 326 is not delete the created tag places|AT

wpferguson commented 3 years ago

Got it. Sorry I was so dense. I had it in my head that you could manipulate a part of the path. Next time I'm being dense tell me to go drink some coffee. 😄

quovadit commented 3 years ago

so can I open a PR to update the existing script delete_unused_tags ? (with the new function-name of course)

wpferguson commented 3 years ago

You'll need to wrap the test in an API version check since this wont show up in darktable until 3.6. If anyone downloads/updates their scripts before 3.6 is out then the script would fail. The current API is 6.2.2, but there is a PR submitted that will bump it to 6.2.3.

quovadit commented 3 years ago

I did exception-handling with pcall instead...

phweyland commented 3 years ago

If we agree on the goal, it is time to discuss about what we do. @quovadit, what's your idea ? As @wpferguson said, we don't want that other users complain that delete_unused_tags doesn't remove all unattached tags...

quovadit commented 3 years ago

I would only delete a tag, if get_reference_count == 0

I can't imagine that there are complaints about this, because the standard-usecase is not affected

phweyland commented 3 years ago

I would not take the risk. At least a lua preference could help to keep everyone satisfied.

Then, coded as you have done in tags.c # 8270, you access the db for every single tag, potentially thousand of times. This doesn't sound great to me.

quovadit commented 3 years ago

I already removed the db-access after review by @phweyland

phweyland commented 3 years ago

I already removed the db-access after review

We may not talk about the same one. I'm talking about dt_lua_tag_get_reference_count() -> dt_tag_count_tags_images() which calls the db.

Here is my suggestion:

Does that make sense ?

wpferguson commented 3 years ago

add a routine in tag.c to delete unused tags with a parameter for the choice we have discussed above: all non attached tags or only those which are not on the path of an attached one. One or two db queries should do the job.

The new way uses 3 db calls versus 1, and the calls include string matching. I did a quick test and the new way takes 4x as long. I agree with @phweyland, this probably needs to be a C function with a little db wizardry to figure out how to do it in as few calls as possible. We could add a delete_unused_tags(boolean consider_reference) then the script would just be a preference creation, preference check and a function call.

quovadit commented 3 years ago

If it will be a C function, wouldn't it be obvious to move the whole lua-script to darktable core? (preferences in tagging module)

phweyland commented 3 years ago

If it will be a C function, wouldn't it be obvious to move the whole lua-script to darktable core? (preferences in tagging module)

That's a valid question. It would be interesting to have others' standpoint. My view: this is not a core function and is well placed among the lua scripts.

wpferguson commented 3 years ago

A few thoughts

I think that delete_unused_tags.lua should remain as is and we should create a new script, delete_unreferenced_tags.lua. Wrapping both in the same script leads to a possibility of picking the wrong one with unhappy consequences for the user. Since it runs when it starts, the first run can do the damage before you get a chance to pick what you want deleted.

To me, delete unused|unreferenced tags is a database or tag maintenance function. We could add the core functions and just access them through the lua scripts until enough database maintenance functions exist to make a db maintenance module a reality. Or we could make a db maintenance module script and hash out the functions until we get it right, then move it all into core.

The other tagging thing I would like to see are batch tagging options, so that we don't have to loop over a table of images or tags and keep making single calls to the database. If I'm changing a few images then what we have works, but it doesn't scale.

Lowclouds commented 2 years ago

I'm late to the game, as well as a darktable noob who is trying to move from Lightroom to darktable, and tagging has been a main issue preventing that in the past. It looks like it should work now, but in setting up darktable - before importing any images - I tried to set up the keywords first, partially to fix issues in my Lightroom setup. I exported my Lightroom keys, edited them, and then imported them successfully, but then noticed I had some spaces where I didn't want them. I then edited then and re-imported, but now, I have hundreds of nodes with unwanted spaces that I appear to only be able to delete one-by-one from the GUI. It looks like any version of this Lua script might work, but one comment I might make is that it would be useful to have an option to keyword import to simply replace the existing set with the imported one.