Closed ShiroyukiX closed 2 years ago
I wanted to suggest using something like
"image-filter": "not any(t in tags for t in ('tag1', 'tag2', 'tag3', 'tag4'))"
and that would theoretically work, but it doesn't due to how Python handles variable look-ups. You get a "NameError: name 'tags' is not defined" if you try, even though it is defined.
Fixing the root cause of this is possible, but complicated. Would it be OK to have a function that does this check, e.g. something like
"image-filter": "not contains(tags, ('tag1', 'tag2', 'tag3', 'tag4'))"
because adding just that is rather easy.
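The NameError described above can be reproduced in plain Python. This is a minimal sketch, assuming the filter expression is evaluated with eval() and the post metadata is supplied as the locals mapping (an assumption about the internals, not confirmed in this thread). The metadata dict is made up for illustration.

```python
# Hypothetical metadata for one post; real field names vary by site.
metadata = {"tags": "tag1 tag5"}

expr = "not any(t in tags for t in ('tag1', 'tag2', 'tag3', 'tag4'))"
code = compile(expr, "<filter>", "eval")

# Metadata passed as *locals*: the generator expression runs in its own
# nested scope, which looks names up in globals and builtins only.
try:
    eval(code, {}, metadata)
    locals_error = None
except NameError as exc:
    locals_error = str(exc)

# Metadata passed as *globals*: the nested scope can resolve 'tags'.
globals_result = eval(code, dict(metadata))

print(locals_error)
print(globals_result)  # False, because 'tag1' is present
```

A plain expression such as "'tag1' not in tags" does not have this problem, since it contains no nested scope.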
I think this is the best solution for such cases, yes..
I tried to use this for both rule34us and the base gelbooru module, but the program spits out the following error:
gelbooru: FilterError: Evaluating filter expression failed (NameError: name 'contains' is not defined)
rule34us: FilterError: Evaluating filter expression failed (NameError: name 'contains' is not defined)
This is how I have it written in the config file, which is a copy of gallery-dl-example.conf with no changes. Let me know if you need the whole document. Maybe I'm placing it in the wrong spot?
NOTE: I tried these two tags with/without the underscore as a test for this.
"rule34us":
{
"image-filter": "not contains(tags, ('azur_lane', 'genshin_impact'))"
},
"gelbooru":
{
"image-filter": "not contains(tags, ('azur_lane', 'genshin_impact'))"
},
No, you're simply a bit too early, this function is not added yet! 😄
Is this request for excluding certain tags from e.g. filenames or just straight up not downloading the files that contain certain tags?
this is to avoid downloading files that contain certain tags.
What you tried in https://github.com/mikf/gallery-dl/issues/2446#issuecomment-1081744386 should now work (https://github.com/mikf/gallery-dl/commit/413b77757b13bd4670028eb8a5265dd0d2a86ac9), but be aware that different boorus have different tag structures. Some use underscores, some spaces, and for some it is called tag_string instead of tags.
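To illustrate the point about differing tag structures, here is a rough sketch of what a helper like contains() could look like. It is not necessarily what the linked commit added; the function body, the separator parameter, and the sample posts are all assumptions made for illustration.

```python
def contains(values, elements, separator=" "):
    """Return True if any item of 'elements' is one of the tags in 'values'.

    Hypothetical helper: a string is split into individual tags first,
    a list is used as-is.
    """
    if isinstance(values, str):
        values = values.split(separator)
    return any(e in values for e in elements)


# Made-up metadata showing the different shapes mentioned above.
post_a = {"tags": "azur_lane solo long_hair"}       # space-separated string
post_b = {"tag_string": "genshin_impact 1girl"}     # different field name
post_c = {"tags": ["original", "landscape"]}        # list of tags

blacklist = ("azur_lane", "genshin_impact")

keep_a = not contains(post_a["tags"], blacklist)
keep_b = not contains(post_b["tag_string"], blacklist)
keep_c = not contains(post_c["tags"], blacklist)

print(keep_a, keep_b, keep_c)  # False False True
```

Whichever field a site uses, checking the -j output first shows whether it is a string or a list and which separator applies.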
Had to install the latest dev version to test it out and it appears to work. On that note, for the different boorus, what command do I use to determine the tag structure before downloading? I assume -j.
Also, for tag_string, I replace (tags, ('tag1', 'tag2')) with (tag_string, ('tag1', 'tag2')) for boorus that use this option, if I'm understanding you correctly.
I may have run into an issue, though I cannot say if it's a filtering or tag reading problem (or whatever the heck it is).
While attempting to download shinano_(azur_lane) artwork from rule34us, it for some reason only grabs 10 files. I "sync" my image-filter and site blacklist to make sure both are up to date, and with my current setup the search should download about 90 files; it does not. I'm certain from previous attempts that what appears in my search will download, so I don't believe it's a tag I filtered out.
EDIT: it does appear that gallery-dl is skipping these files. I tried downloading a specific result from what the search gave me and it appears in neither the log nor my directory. I checked the JSON with -j and none of the tags listed are filtered out.
EDIT2: Checked a new search with an artist and, again, it's downloading fewer files than my search result shows (41 / 53). Commenting out the tags fixes the result so I'm assuming it's a tagging issue of some sort; not sure if it's only rule34us that does this.
I may have figured out the problem, though results may vary.
Gallery-dl skips all sub-categorized versions of a tag if it's filtered with image-filter. For example, if you blacklist animal, gallery-dl will skip any image with a related tag (animal_ears, animal_humanoid, etc), even if animal is not used for the image. You have to blacklist the sub-categorized versions only to avoid the issue.
This only appears to happen with rule34us as their tagging system is janky (character tags are under general tags, for instance), but it may happen for other booru sites. Also, defining the first argument as tags_general seems to work as well.
I'm still not sure if this problem is site-specific or program-specific, though my conjecture favors the former. You may want to test it out for other sites to confirm your search result numbers match the number of files downloaded, and make sure to define your filter as mentioned in https://github.com/mikf/gallery-dl/issues/2446#issuecomment-1083328571 to catch the tags accurately.
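One possible explanation for the animal / animal_ears behaviour, assuming the site's tags field is a single space-separated string and the check ends up using Python's in operator on it directly: on a string, in is a substring test, while on a list it compares whole elements. The tag string below is made up for illustration.

```python
# Hypothetical tag string; note that the tag 'animal' itself is absent.
tag_string = "animal_ears animal_humanoid 1girl"

# Substring test on the raw string: matches inside 'animal_ears'.
substring_hit = "animal" in tag_string

# Element test on the split list: only whole tags can match.
exact_hit = "animal" in tag_string.split()

print(substring_hit)  # True
print(exact_hit)      # False
```

If that is what happens here, a field that is already a list of tags would not show the problem, which may be why results differ between sites and fields.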
I will leave the information here, since I could not find any similar topics on the Internet. Perhaps it will be useful to someone. I needed to exclude posts that have two tags at the same time on the e621 site and I did not understand how. But after researching the problem, I came up with the following solution for the filter:
--filter "not ('tag1' in tags['general'] and 'tag2' in tags['general'])"
You can also use it in a config file gallery-dl.conf:
"image-filter": "not ('tag1' in tags['general'] and 'tag2' in tags['general'])"
or, as an additional example:
"image-filter": "not ('tag1' in tags['general'] and ('tag2' in tags['general'] or 'tag3' in tags['general']))"
"image-filter": "not ('tag1' in tags['general'] and ('tag2' in tags['general'] or 'tag3' in tags['general']))"
Code like that insta-boils my piss no matter the language, instead try this untested adaptation of this StackOverflow answer:
"image-filter": "any(True for _e in ('tag1','tag2','tag3') if _e in tags['general'])"
Not sure if you need a list comprehension (any([...])) for this trick since I didn't test it; I sure hope not.
It should return immediately if a blacklisted keyword is found and being a generator, should use minimal RAM.
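The two styles above do not select the same files, which a small experiment makes visible. This is a sketch under assumptions: a filter keeps a file when its expression is truthy, the sample metadata is invented, and the expressions are evaluated with the metadata as globals so the generator can see tags. The leading not on the any() version is added here, because as posted it is true when a blacklisted tag is present.

```python
both_required = "not ('tag1' in tags['general'] and 'tag2' in tags['general'])"
any_blocked = (
    "not any(True for _e in ('tag1','tag2','tag3') if _e in tags['general'])"
)


def keep(expr, metadata):
    """Evaluate a filter expression; True means the file would be kept."""
    # Metadata is passed as globals (copied) so nested scopes resolve 'tags'.
    return bool(eval(expr, dict(metadata)))


# Made-up posts in an e621-like shape (tags grouped by category).
only_tag1 = {"tags": {"general": ["tag1", "other"]}}
tag1_and_tag2 = {"tags": {"general": ["tag1", "tag2"]}}

results = {
    "both_required, only tag1": keep(both_required, only_tag1),
    "both_required, tag1+tag2": keep(both_required, tag1_and_tag2),
    "any_blocked, only tag1": keep(any_blocked, only_tag1),
    "any_blocked, tag1+tag2": keep(any_blocked, tag1_and_tag2),
}

for name, kept in results.items():
    print(name, "->", "keep" if kept else "skip")
```

The and form skips a post only when both tags are present together, while the any() form skips it as soon as a single listed tag appears, so which one is right depends on the kind of blacklist wanted.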
(Please excuse my inexperience with github. It's my first time ever posting here.)
How would I go about creating a blacklist of tags for booru sites (rule34us, gelbooru, danbooru, safebooru, etc) or any that use tags/tag-like systems in organizing artwork?
I checked the Issues page and configuration doc extensively and cannot find a solution. I know you can use
--filter "'TAG' not in tag"
but haven't seen any mention of a case with multiple tags besides --filter "'TAG1' not in tag and 'TAG2' not in tag". I tried image-filter but my understanding of Python and JSON is nonexistent. I tend to search these sites and download by artist or character, blocking any tags I dislike through a blacklist; I can't catch every single weird tag used for the same subject so the blacklist is big. I don't have any account with these websites, if that information is relevant.
Is this even possible to begin with? I want to believe there is a solution to filter the results without having to input multiple and statements into a command.