Closed GoogleCodeExporter closed 8 years ago
Shoot, I forgot to ask: in general, what's the purpose of the (X of Y) result? Are there potential discrepancies between the results reported by the server and the amount received, or are items dropped by the program itself based on some criteria?
As an aside, in trying to figure this out, I discovered that Gelbooru reports some incredibly incorrect numbers for the search results on their web page. In counting them out manually, some were reported about 30% higher than the actual number of images on the site. For instance, the engine reports 533 pics for the artist aaaa, but there are only 389 in the results. Grabber reported the correct total here.
Original comment by PhotonB...@gmail.com on 7 Jan 2012 at 6:25
In fact, the issue doesn't come from Grabber but from the danbooru system. Take your "hat" example: when you search "hat" directly on the website, it shows the same number of pages (35 pages at 30 images per page, so about 1050 images). But when I go to page 21 and higher (without being logged in), I get blank pages, which leaves only 581 results. I assume some images are hidden for some reason (like loli on danbooru).
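As a quick sanity check on those numbers, here is a small Python sketch. The page size of 30 and the counts 35 and 581 are the figures quoted above; the variable names are purely illustrative, and the sketch assumes the visible results are packed onto consecutive pages.

```python
import math

PER_PAGE = 30          # images per page, as shown on the site
REPORTED_PAGES = 35    # pages advertised for the "hat" search
VISIBLE_RESULTS = 581  # results actually returned when not logged in

# Upper bound on the advertised total: every page completely full.
max_advertised = REPORTED_PAGES * PER_PAGE

# Number of pages needed to hold the results that are actually visible.
pages_with_content = math.ceil(VISIBLE_RESULTS / PER_PAGE)

print(max_advertised)      # 1050
print(pages_with_content)  # 20, so page 21 and up come back blank
```

This matches what I saw: content stops after page 20, even though 35 pages are advertised.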
So I think you have to log in to see them. To do so, you have to set your login and password in the options. But when you hash the password, you must salt it, which means you must add some text to it before hashing. For danbooru, the salted string is "choujin-steiner--your-password--". For WC, it seems to be "doubutsunomori--your-password--". For the moment the login system is a little bit shitty, but I plan to improve it in the future.
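To make the salting step concrete, here is a minimal Python sketch. It assumes the hash function is SHA-1 (my understanding of the danbooru 1.x API; the algorithm is not named in this thread), and the function and constant names are hypothetical. The salt template is the danbooru one quoted above, with "your-password" turned into a placeholder.

```python
import hashlib

# Salt template quoted in this thread; "{password}" stands in for "your-password".
DANBOORU_SALT = "choujin-steiner--{password}--"


def salted_password_hash(password: str, template: str = DANBOORU_SALT) -> str:
    """Wrap the password in the site's salt, then hash the whole string.

    Assumption: SHA-1 with a hex digest, as in the danbooru 1.x API.
    """
    salted = template.format(password=password)
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()


# The resulting 40-character hex string is what goes in the password field.
hashed = salted_password_hash("your-password")
```

A site with a different salt would only need a different `template` argument.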
For gelbooru, I think it's just a bug in their tag list. :)
Finally, about the purpose of the "(X of Y)" result: you are right, some images are dropped, based on the "post-filtering" field, which acts like a search within the results. :)
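Roughly, the idea behind "(X of Y)" can be sketched like this in Python. This is not Grabber's actual code: the result structure, the function name, and the rule "keep a result only if it carries every filter tag" are assumptions for illustration.

```python
def post_filter(results, filter_tags):
    """Keep only the results that carry every tag in filter_tags."""
    wanted = set(filter_tags)
    return [r for r in results if wanted <= set(r["tags"])]


# Hypothetical results as returned by the server (Y = 3).
results = [
    {"id": 1, "tags": ["hat", "solo"]},
    {"id": 2, "tags": ["hat", "comic"]},
    {"id": 3, "tags": ["hat", "solo", "smile"]},
]

# Post-filtering on "solo" drops one result (X = 2).
kept = post_filter(results, ["solo"])
label = f"({len(kept)} of {len(results)})"
print(label)  # (2 of 3)
```

With an empty post-filtering field nothing is dropped, so X equals Y.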
Original comment by bio.nus@hotmail.fr on 7 Jan 2012 at 6:58
Well, I guess I wasn't logged in after all ^^;;; and they recently made all images sourced from Pixiv accessible to members only. I wasn't aware of the hash/salt issue, but I've been using plaintext for a few other sites, so I'm not sure where I stood with that, I guess. More digging in the WC forums turned up that they have a custom booru, and the admin thought there might be some features that aren't compatible to spec any more without a little tidying. Will pursue it with them when I have a bit more time.
Sorry to bug you with that, but I appreciate your input...never would have found the info on WC or danbooru without knowing I had to look for that API salt.
For the login system, I had been thinking it would indeed be nice to be able to input passwords for more than one site. It might be a little more intuitive to add some popup dialogues to the existing Sources feature, rather than have a separate page in the options.
Also, on the current login screen, it might be worth clarifying that certain sites might require password hashing and a unique salting key. The current message from the API FAQ only implies that it's a good idea for best security practices, not that it's necessary to log in. And the hashing dialogue box is in French in the English version, still ^^
Original comment by PhotonB...@gmail.com on 9 Jan 2012 at 2:54
Aha, their API help's just wrong in fact. You must hash "choujin-steiner--your-password--" (same as danbooru) and not "doubutsunomori--your-password--". Can't believe I got stuck one hour on this. xD
For the login system, I agree with you, and I have thought of replacing the "Delete" button with an "Edit" button where you would be able to set a username and a password (plus editing the source, of course). What do you think? With a clarification about salting and everything. In fact I admit I added the login option in two minutes and haven't touched it since, so... ^^
And the hashing dialogue box is in French in the English version, still ^^
Thanks, haven't seen this one :X
the admin thought there might be some features that aren't compatible to spec any more without a little tidying
BTW, do you have a link? Can't find anything on their forum.
Original comment by bio.nus@hotmail.fr on 9 Jan 2012 at 6:30
I think I know why some images are 'missing' from danbooru (sometimes resulting in blank pages). The danbooru team often deletes images they think go against their rules (not Japanese, bad drawing, exaggerated proportions, etc.).
After these images are deleted, some residual information seems to be left in the list, resulting in those strange blank pages and missing images.
You can access deleted images up to a certain limit (the tag is status:deleted; maybe only available to some users?). Every time I checked, the number of deleted images was equal to the missing image count. (Still, I am not 100% sure this is the origin of the problem.)
(Thanks for your program, it is really great!)
Original comment by ser...@hotmail.com on 15 Jan 2012 at 5:40
Aha, their API help's just wrong in fact. You must hash "choujin-steiner--your-password--" (same as danbooru) and not "doubutsunomori--your-password--". Can't believe I got stuck one hour on this.
haha, wow, I can't believe you did either ^^ thanks so much, it seems to be working with that salt. I really appreciate it...the WC forums seem to have a lot of drama even compared to other boards, so you probably saved me a bunch of aggravation ^_^
BTW, do you have a link? Can't find anything on their forum.
Well, the only hint I could find before was http://wildcritters.ws/forum/show/25537, but I guess that's superseded by your discovery of the correct salt.
For the login system, I agree with you, and I have thought of replacing the "Delete" button with an "Edit" button where you would be able to set a username and a password (plus editing the source, of course). What do you think?
Sounds perfect!
Every time I checked, the number of deleted images was equal to the missing image count.
Ah, ok, that checks out on danbooru for me, too. I can't seem to access deleted images on gelbooru, but I just happened across a thread that mentioned they did some kind of merge with danbooru a few years ago, and the many extra tag results are from the duplicate images that were deleted from that.
Original comment by PhotonB...@gmail.com on 17 Jan 2012 at 10:00
Hmmm, ok, I have another question related to WC. In addition to the %character% tag, they also use the %species% tag class. I can't seem to figure out a way to get this into the file name...is there a way to employ novel tag classes like this?
Original comment by PhotonB...@gmail.com on 17 Jan 2012 at 12:31
I can't seem to access deleted images on gelbooru ...
status:deleted always gives 0 results on gelbooru. I think the answer to that mystery is simple: gelbooru never deletes pictures!
Sometimes, when running the same search on both danbooru and gelbooru, you get something like this:
artist tag 'konoekihei'
gelbooru: 15 results
danbooru: 11 results (+4 deleted)
So the missing images on danbooru are the deleted images ^^
But generally, danbooru seems to have more images when you only type the name of an artist as a tag, like with "shinozuka_atsuto" (dan = 232 (+2 deleted); gel = 194).
The exceptions are generally when the artist does 'disturbing' or 'too crazy' art. The danbooru team seems to delete that kind of artwork easily. For example, artist 'yozora':
danbooru: 39 + 5 deleted
gelbooru: 162 images
(Is yozora that disturbing?? Or is there another reason?) (usatarou is very crazy => more images on gelbooru!! ^^ XD)
aaaah, the folly of men!
Original comment by ser...@hotmail.com on 17 Jan 2012 at 10:02
Well, the only hint I could find before was http://wildcritters.ws/forum/show/25537, but I guess that's superseded by your discovery of the correct salt.
So they're using danbooru 1.17. This answers some of my questions about it, thanks. :)
Hmmm, ok, I have another question related to WC. In addition to the %character% tag, they also use the %species% tag class. I can't seem to figure out a way to get this into the file name...is there a way to employ novel tag classes like this?
No, there is currently no way to add a tag type. But I think I'll add a setting for such custom tag types in the future; that's a good idea (since other boorus seem to do so too, such as behoimi).
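Just to illustrate what such a setting could look like, here is a hypothetical Python sketch, not how Grabber builds file names today. It assumes tags arrive grouped by type, so that any type the site defines (including %species%) can be used as a token; the function name and sample data are made up.

```python
import re


def build_filename(template, tags_by_type, separator=" "):
    """Replace each %type% token with the tags of that type."""

    def substitute(match):
        tags = tags_by_type.get(match.group(1))
        # Leave unknown tokens untouched so a typo stays visible.
        if tags is None:
            return match.group(0)
        return separator.join(tags)

    return re.sub(r"%(\w+)%", substitute, template)


# Hypothetical post whose tags are grouped by type, including a custom one.
post_tags = {"character": ["some_character"], "species": ["fox"]}
name = build_filename("%species% - %character%.jpg", post_tags)
print(name)  # fox - some_character.jpg
```

The point of the design is that the list of tag types is data rather than being hard-coded, so a new type needs no change to the program.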
The exceptions are generally when the artist does 'disturbing' or 'too crazy' art. The danbooru team seems to delete that kind of artwork easily.
I think it is related not only to the danbooru team but also to the uploader community. Gelbooru is better suited than danbooru for this kind of picture, not because such pictures would be deleted outright on danbooru (the proof is that it has some anyway), but, I think, because the visitors (and so the uploaders) are just not looking for this kind of image on danbooru. shinozuka_atsuto, on the other hand, seems much more "classical" and as such has more posts on danbooru.
Original comment by bio.nus@hotmail.fr on 17 Jan 2012 at 10:49
At the top of the search window, Grabber lists the current and total pages, and the number of results (X out of Y). While using the program with wildcritters.ws, I noticed that these numbers were pretty far out of whack, like (78 out of 129) in an artist batch I just did.
The results number isn't always exact, but it usually seems to be something like plus or minus 2. The program is somehow dropping up to half of the search results, and seems to do so increasingly starting around 70 hits and up (only 581 total out of 1040 when searching 'hat'). Searches with fewer than 20 results seem to be unaffected.
I've tried it with and without login (shouldn't be required, even for adult). I've counted the results out manually on the web site, and confirmed their total numbers are correct. I've also tracked down some of the images that failed to appear/download, and they were just basic jpegs with no distinguishing traits. Nor am I using a blacklist.
I'm not sure if this is unique to WC, or if it might be indicative of a more general problem with the downloader. In any case, I was hoping you might have some better ideas about this, if you might find a moment to look into it at your convenience ^^