downthemall / anticontainer

DownThemAll! AntiContainer (Extension to a Firefox, Seamonkey extension)
Mozilla Public License 2.0

plugin for booru style galleries #94

Closed wyrde closed 8 years ago

wyrde commented 9 years ago

The various boorus provide searchable indexes of images displayed as thumbnails. Each thumbnail links to a container page. Standard stuff.

The major problem is that the image thumbnail urls don't provide all the information needed to get to the source image. Additionally, depending on the age of the image, it may have different thumbnail information from older versions of booru.

A userscript can be used (see below for examples) to change the thumbnail urls to sources, but there's the issue of the extensions. The easiest way (that I know of) to check if the extension is correct is to try the URL and see if it is successful or a 404. The problem with that is every time an array of thumbnails is generated (search results, etc), the server gets hit with a bunch of connections as the userscript tries to check for valid URLs. I suppose just a list of URLs can be created without checking, but then downthemall will hammer the server as it goes through the list.

I think it would be much better (and politer) to use an AntiContainer plugin that checks for valid extensions while dTa is running. Trying additional extensions only when a 404 is encountered should be much friendlier to the server.

Thumbnails end in jpg, regardless of the source extension. As far as I can tell, the extensions, in order of most use, are: jpg jpeg png ping gif

From what I've seen of the plugins, it seems like it would be best to split this into two functions? One to rewrite the URLs and a second to try different extensions if the original results in a 404?

Actually, the latter plugin/function would solve the issue nicely (when used in conjunction with a userscript like the ones below) and may be helpful in other cases.
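The "try the next extension only on a 404" idea could look roughly like the sketch below. It is only an illustration, not AntiContainer's plugin API: `firstExistingUrl` and `headStatus` are hypothetical names, and `headStatus` assumes an environment with a global `fetch` (a userscript might need a cross-origin request helper instead).

```javascript
// Hypothetical probe: returns the HTTP status of a HEAD request.
// Assumes a global fetch(); not part of AntiContainer.
async function headStatus(url) {
  const response = await fetch(url, { method: 'HEAD' });
  return response.status;
}

// Walk the candidate URLs in order and stop at the first one that exists.
// Only a 404 moves on to the next candidate; any other failure stops the
// search so the server is not hit with pointless extra requests.
async function firstExistingUrl(candidates, probe = headStatus) {
  for (const url of candidates) {
    const status = await probe(url);
    if (status >= 200 && status < 300) {
      return url;
    }
    if (status !== 404) {
      return null;
    }
  }
  return null;
}
```

Because the candidates are tried in order of most common extension, most images would need a single request, and the extra requests happen only for the ones that actually return a 404.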

Samples

Sample URL (NSFW) (I tried to make it as safe as possible, but it depends on what was tagged): http://gelbooru.com/index.php?page=post&s=list&tags=2girls+cat_tail+cat_ears+-nude+-cum+-panties+-ahoge+-nip%2a

Sample Thumbnail (SFW) http://gelbooru.com/thumbnails/21/aa/thumbnail_21aaa5f06964f03d01f664aca7418853.jpg?2472493

Sample Source Image (SFW) http://simg3.gelbooru.com//images/21/aa/21aaa5f06964f03d01f664aca7418853.png

Not all images have simg3, some are simg2. As far as I can tell, the simg# can be dropped: http://gelbooru.com//images/21/aa/21aaa5f06964f03d01f664aca7418853.png works as well.
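Going by the sample thumbnail and source URLs above, the rewrite step could be sketched as a function that turns one thumbnail URL into a list of candidate source URLs, one per extension. The function name and the default extension list are assumptions made for illustration; the URL layout is taken only from the single sample pair shown here, so older thumbnails may not fit it.

```javascript
// Assumed default order, most common extension first.
const DEFAULT_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif'];

// Hypothetical helper: maps a thumbnail URL to candidate source URLs.
// Returns an empty array when the URL does not look like the sample layout.
function candidateSourceUrls(thumbUrl, extensions = DEFAULT_EXTENSIONS) {
  const pattern =
    /^(https?:\/\/[^\/]+)\/thumbnails\/(.+\/)thumbnail_([0-9a-f]+)\.\w+(?:\?.*)?$/i;
  const parts = pattern.exec(thumbUrl);
  if (!parts) {
    return [];
  }
  const [, origin, dirs, hash] = parts;
  // The double slash before "images" mirrors the sample source URL.
  return extensions.map((ext) => `${origin}//images/${dirs}${hash}.${ext}`);
}

const candidates = candidateSourceUrls(
  'http://gelbooru.com/thumbnails/21/aa/thumbnail_21aaa5f06964f03d01f664aca7418853.jpg?2472493'
);
console.log(candidates);
```

The thumbnail's own `.jpg` extension and the `?2472493` query string are discarded, since the thumbnail extension says nothing about the source extension.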

Sample Container (SFW) http://gelbooru.com/index.php?page=post&s=view&id=2472493

Sample <div> for a thumbnail.

<div><span id="s2472493" class="thumb"><a id="p2472493" href="index.php?page=post&amp;s=view&amp;id=2472493" ><img src="http://gelbooru.com/thumbnails/21/aa/thumbnail_21aaa5f06964f03d01f664aca7418853.jpg?2472493" alt=" 2girls :q animal_ears blonde_hair blue_eyes blue_hair bow candy cat_ears cat_tail chibi dress flower frederica_bernkastel goatman halloween hat jack-o&amp;#039;-lantern kotato lambdadelta lollipop mouth_hold multiple_girls red_eyes rose scythe tail television tongue tongue_out tree umineko_no_naku_koro_ni " border="0" title=" 2girls :q animal_ears blonde_hair blue_eyes blue_hair bow candy cat_ears cat_tail chibi dress flower frederica_bernkastel goatman halloween hat jack-o&amp;#039;-lantern kotato lambdadelta lollipop mouth_hold multiple_girls red_eyes rose scythe tail television tongue tongue_out tree umineko_no_naku_koro_ni  score:0 rating:safe" class="preview"/></a></span>

There are some Greasemonkey userscripts that try to alter the thumbnails into source images, but as far as I can tell they all assume the thumbnail extension will match the source image's extension.

Greasemonkey examples

https://greasyfork.org/en/scripts/752-gelbooru-revamped

Revamped does a good job of cleaning up the various thumbnail URL names:

    var source1 = origSource.replace(/thumbnail/g,'sample');
    var source2 = origSource.replace('thumbs','images').replace('thumbnail_','').replace('http://','http://simg2.');

but (as far as I can tell) doesn't have a way to deal with the different extensions.

https://greasyfork.org/en/scripts/114-directgelf

A much simpler userscript. It doesn't do as well with the names, but it tries to match extensions, though I have never seen it match them successfully.

(Note: I made various attempts, both from userscript end and plugin, to solve the issue but while I can kind of read java, my inability to actually write code is second only to herding squirrels.)

marianocarrazana commented 8 years ago

This works for me:

    {
      "type": "resolver",
      "ns": "downthemall.net",
      "prefix": "gelbooru.com",
      "match": "^http://(?:[\\w\\d]+\\.)?gelbooru\\.com/index.php\\?.+&s=view.+",
      "finder": "<img.+src=\"([^\"]+)\".id=\"image\"",
      "builder": "{1}",
      "static": true,
      "gone": "Image doesn't exist"
    }
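To see what the two patterns in this resolver pick out, here is a rough sketch that runs them by hand. It assumes `match` is tested against the container URL and `finder` against the fetched page, with the first capture group becoming the download URL, which is how the fields read. The `resolve` function and the sample markup are hypothetical; the real container page is not quoted in this thread.

```javascript
// Patterns copied from the resolver definition above (JSON escaping removed).
const match = new RegExp('^http://(?:[\\w\\d]+\\.)?gelbooru\\.com/index.php\\?.+&s=view.+');
const finder = new RegExp('<img.+src="([^"]+)".id="image"');

const containerUrl = 'http://gelbooru.com/index.php?page=post&s=view&id=2472493';
const listUrl = 'http://gelbooru.com/index.php?page=post&s=list&tags=2girls';

// Hypothetical container-page markup, made up for illustration only.
const sampleHtml =
  '<img alt="sample" src="http://simg3.gelbooru.com//images/21/aa/21aaa5f06964f03d01f664aca7418853.png" id="image"/>';

// Hypothetical stand-in for the resolver: returns the captured image URL,
// or null when the URL is not a container page or no image is found.
function resolve(url, html) {
  if (!match.test(url)) {
    return null;
  }
  const found = finder.exec(html);
  return found ? found[1] : null;
}

console.log(resolve(containerUrl, sampleHtml));
console.log(resolve(listUrl, sampleHtml));
```

Since the image URL is read from the container page itself, the extension is whatever the server reports, so no guessing is needed. The cost is one extra page request per image.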