rr- / szurubooru

Image board engine, Danbooru-style.
GNU General Public License v3.0

Post auto tagger idea #85

Closed kotcrab closed 8 years ago

kotcrab commented 8 years ago

I'd like to make a post auto tagger for szurubooru2, and I've created this project for it. So far it can send images to IQDB and get tags from Danbooru posts.

My idea is to periodically check for posts with a specific tag (iqdb_tagme, but configurable in options), then query IQDB, get tags from Danbooru, and update the tags on szurubooru posts using its REST API. It would work in the background and slowly get tags for all images to avoid hitting API limits. In case of failure (either an unrecognized image or something else) the post would be tagged with a single different tag (iqdb_error?) and the user could review those posts later.
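A minimal sketch of that loop, assuming posts are held as a plain dict of post ID to tag list and that the IQDB/Danbooru lookup is passed in as a function. The names `retag`, `run_once`, `lookup` and `batch_size` are hypothetical and are not part of szurubooru or the tagger project.

```python
from typing import Callable, Dict, Iterable, List, Optional

PENDING_TAG = "iqdb_tagme"  # marks posts that still need tagging (configurable in the proposal)
ERROR_TAG = "iqdb_error"    # marks posts whose lookup failed, for later manual review


def retag(current_tags: Iterable[str], found_tags: Optional[List[str]]) -> List[str]:
    """Return the new tag list for one post; found_tags is None when the lookup failed."""
    kept = {t for t in current_tags if t != PENDING_TAG}
    if found_tags is None:
        # Failure: replace the pending marker with the error marker.
        return sorted(kept | {ERROR_TAG})
    return sorted(kept | set(found_tags))


def run_once(
    posts: Dict[int, List[str]],
    lookup: Callable[[int], Optional[List[str]]],  # hypothetical IQDB + Danbooru lookup
    batch_size: int = 5,
) -> Dict[int, List[str]]:
    """Process at most batch_size pending posts per run, to stay under API limits."""
    pending = [pid for pid, tags in posts.items() if PENDING_TAG in tags][:batch_size]
    for pid in pending:
        try:
            found = lookup(pid)
        except Exception:
            found = None  # any error is treated like an unrecognized image
        posts[pid] = retag(posts[pid], found)
    return posts
```

Calling `run_once` from a scheduler at a fixed interval would give the slow background behavior described above; a real tool would fetch and update posts through the REST API instead of a dict.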

Danbooru tags can have categories. I'd handle them this way: if a tag does not exist yet, apply the category from Danbooru. If it already exists on szurubooru, just add it to the post.
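The category rule can be sketched as follows, assuming the local tag database is a dict of tag name to category and the Danbooru result is a dict of the same shape. The function name and both data structures are illustrative assumptions.

```python
from typing import Dict, Set


def apply_danbooru_tags(
    local_categories: Dict[str, str],  # existing szurubooru tags: name -> category
    post_tags: Set[str],               # tags currently on the post
    danbooru_tags: Dict[str, str],     # tags found on Danbooru: name -> category
) -> None:
    for name, category in danbooru_tags.items():
        if name not in local_categories:
            # New tag: create it with the category Danbooru uses.
            local_categories[name] = category
        # Existing tag: leave its local category untouched.
        post_tags.add(name)
```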

I think it would be pretty useful. What's your opinion on this, does it make sense?

rr- commented 8 years ago

Sounds great - I like how it can be run separately from szurubooru's internals and communicate via its REST API, and therefore can be managed via cron jobs.

However, I have a concern regarding tag sync - what if Danbooru changes something, or szurubooru's users rename / merge / split their tags, and the auto tagger mixes them up after the next run? My preliminary suggestion would be to offer a config with simple tag mappings ("don't pull tag X" + "rename tag X to Y" + "remap category X to Y").
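One way such mappings could look, as a sketch only: three lookup tables applied to each pulled tag. The table names and the example entries are made up for illustration and do not reflect any real config format.

```python
from typing import Optional, Tuple

# Hypothetical mapping tables; entries are examples only.
SKIP_TAGS = {"tagme"}                       # "don't pull tag X"
RENAME_TAGS = {"1girl": "one_girl"}         # "rename tag X to Y"
REMAP_CATEGORIES = {"copyright": "series"}  # "remap category X to Y"


def map_tag(name: str, category: str) -> Optional[Tuple[str, str]]:
    """Return the (name, category) to store locally, or None if the tag is skipped."""
    if name in SKIP_TAGS:
        return None
    return RENAME_TAGS.get(name, name), REMAP_CATEGORIES.get(category, category)
```

Because the tables are applied on every run, a rename made by szurubooru's users only has to be recorded once to survive later syncs.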

On a side note, right now tag categories cannot be managed, and are placed in server config instead (both in 1.x and 2.x). Technically 2.x's tag categories are stored as strings inside Tag table. Offering to change them via the web interface, or REST, is not something I have thought about. I validate categories with server config because of colors (which I imagined to hardcode in server config) and to offer something in the category combobox in the tag edit form. Maybe I could get rid of them from config altogether, offer category coloring solely via user CSS (so that admins can add [data-tag=copyright] { color: red }), and let the (privileged) users type whatever in the tag category? I'd like to know what @tehoko thinks about this, too.

oczki commented 8 years ago

That's a lot of after-deployment configuration to leave to the administrator. I wouldn't want to manually deal with CSS after installing szurubooru.

A more painful to implement, but in turn more user-friendly option would be to allow admins to manage tag categories via some additional page:

[image: mockup of a tag category management page]

Perhaps the first one would have to be un-removable, or restored automatically on reset. Tags that were assigned to a category that is about to be reset would need to switch to the generic one.
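A small sketch of that rule, assuming categories are an ordered list whose first entry is the generic one and that tags are stored as a dict of tag name to category. `remove_category` is a hypothetical helper, not an existing szurubooru function.

```python
from typing import Dict, List


def remove_category(categories: List[str], tag_categories: Dict[str, str], name: str) -> None:
    default = categories[0]  # assumption: the first category is the generic, un-removable one
    if name == default:
        raise ValueError("the default category cannot be removed")
    categories.remove(name)
    # Tags that used the removed category fall back to the generic one.
    for tag, category in tag_categories.items():
        if category == name:
            tag_categories[tag] = default
```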

As for the whole idea, it sounds cool as a third-party tool, but removes some independence of your own booru. It feels better to have one's own ecosystem instead of relying on tags from other pages.

rr- commented 8 years ago

I wouldn't want to manually deal with CSS after installing szurubooru.

That's understandable. So, judging from the mockup, you want the tag categories to be a set of fixed values, rather than freeform strings. I prefer it that way as well.

But regarding the proposed UI itself: isn't configuring it via config file sufficient? Having a config file is a definite must - for example, you have to define database credentials somewhere, and things such as preparing user ranks really come before logging in. There is only a handful of settings that could be tweaked from the client, and IMO, scattering parts of szurubooru's config between server and client would only add confusion.

So the real question is whether to let users "click their way" through the categories, or just shove it into the config (the cheap option, currently present in 1.x branch).

oczki commented 8 years ago

It depends on whether you want the moderators/administrators who are not the owners of the server (no access to its filesystem) to be able to control this.

rr- commented 8 years ago

Hm, that's a good question. I believe so. Maybe we could add it near the search input, like this?

[image: 20160418_190943_qab]

oczki commented 8 years ago

Looks good.


judging from the mockup, you want the tag categories to be a set of fixed values, rather than freeform strings.

I don't think I understand. In the mockup, the white boxes are input fields, allowing the categories to be renamed. If you're talking about editing a single tag's category, then yes, the choice there would be limited to what's already set in the mocked up page (limited via dropdown or radiobox).

kotcrab commented 8 years ago

My preliminary suggestion would be to offer config with simple tag mappings ("don't pull tag X" + "rename tag X to Y" + "remap category X to Y").

I agree such configuration would be required. Also, the URL acquired from IQDB should be stored in some way. I was thinking about storing it in the source element of the szurubooru post, or making the auto tagger use its own database, but that sounds like overcomplicating things.

As for tag categories: that UI looks nice, but IMO tag categories aren't something that would change very often, so a config file might be enough. If you are going to make a UI for it, then I would say that only site admins should be allowed to change them (by default, of course).

As for the whole idea, it sounds cool as a third-party tool, but removes some independence of your own booru. It feels better to have one's own ecosystem instead of relying on tags from other pages.

Depends on your use case, really. If you just want to use szurubooru to organize images you hoarded from the internet, then this tool would be very helpful.

rr- commented 8 years ago

@kotcrab regarding auto tagger - I think the 2.x API is pretty stable by now. Once 2.x gets released, you could just skim through git log API.md to check for any incompatible changes.

kotcrab commented 8 years ago

@rr- great, I'll try to set up a local instance next week and start working on the auto tagger.

kotcrab commented 8 years ago

I've finished it. I also added an option to pull post notes, and a batch uploader. You can get more details on the repository page.

rr- commented 8 years ago

@kotcrab I've added tag descriptions (sort of wiki) and greatly relaxed rules regarding tag names.

I also plan to replace manual tag deletion with automatic deletion that is triggered when a tag reaches 0 usages - let me know what you think.

kotcrab commented 8 years ago

Cool, I can add extraction of the tag wiki page.

About tag deletion, makes sense I guess. I have one concern: what if a user spends time adding aliases, writing a description etc. and the tag gets deleted without any warning? That would be pretty annoying.
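To make the concern concrete, here is a sketch of a deletion check with an optional guard for tags that carry aliases or a description. The `Tag` class and the `protect_curated` flag are assumptions for illustration; the thread does not settle on this behavior.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tag:
    name: str
    usages: int = 0
    aliases: List[str] = field(default_factory=list)
    description: str = ""


def should_auto_delete(tag: Tag, protect_curated: bool = True) -> bool:
    """Decide whether a tag may be deleted automatically."""
    if tag.usages > 0:
        return False  # still used by at least one post
    if protect_curated and (tag.aliases or tag.description):
        return False  # hypothetical guard: keep tags someone has put work into
    return True
```

With `protect_curated=False` the check reduces to the plain "0 usages" rule.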

Btw, the source URL is not visible on the post page, but I guess you haven't added it yet.

rr- commented 8 years ago

Citizens of yume.pl decided to trash the source after all, speaking from their experience of never using it.

kotcrab commented 8 years ago

I see. Are you going to remove the source field from the post resource?

rr- commented 8 years ago

I plan to keep it - if users decide to change their mind after 2.x is deployed, the decision will be easily reversible without needing to migrate posts again.

kotcrab commented 8 years ago

Turns out tag wikis are written in a custom formatting language, DText, and I'm not really eager to write a Markdown converter for it, so I won't be adding this feature.

I added a batch downloader, an option to create a comment when a bigger version of the image was found, and an option to store the source URL and use it to make updating tags on existing posts faster.
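A sketch of how a stored source URL could shortcut later updates: if the source already points at a Danbooru post, the post ID can be read from it and the IQDB lookup skipped. This assumes the stored URL has the form `https://danbooru.donmai.us/posts/<id>`; the helper name is hypothetical and the tagger's actual approach may differ.

```python
import re
from typing import Optional

# Assumption: the stored source is a Danbooru post page URL.
DANBOORU_POST_URL = re.compile(r"^https?://danbooru\.donmai\.us/posts/(\d+)")


def danbooru_post_id(source: Optional[str]) -> Optional[int]:
    """Return the Danbooru post ID encoded in a source URL, or None if there is none."""
    if not source:
        return None
    match = DANBOORU_POST_URL.match(source)
    return int(match.group(1)) if match else None
```

A post whose source yields an ID can be refreshed directly from Danbooru; only posts returning `None` would need the slower image search.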

rr- commented 8 years ago

@kotcrab OOC is there any difference between gelbooru's and danbooru's tag ecosystem? We've been adopting gelbooru's tag system on our deploy for a while and wonder how it'll play out if we switch to danbooru-based tagging by adopting your tagger.

(There's also this thing about gelbooru hosting more images - perhaps we could extend auto tagger to cover for various engines?)

kotcrab commented 8 years ago

I don't know. Extending it wouldn't be a big problem, but I wonder if using multiple engines at once is going to create inconsistencies in tags. That aside, Gelbooru's API documentation is pretty much nonexistent.

rr- commented 8 years ago

Looks like the ecosystems are pretty close, which is pretty cool. I agree Gelbooru's API is a joke. Maybe we won't miss Gelbooru after all :+1: