Hupotronic / ExLinks

A userscript to make E-Hentai & ExHentai links on 4chan & Foolz archive more useful. Includes ExSauce.
http://hupotronic.github.com/ExLinks/

Added Similarity Scan option re-enabling JPG image search #58

Open Torrentymous opened 9 years ago

Torrentymous commented 9 years ago

This commit adds the option to use the similarity scan. This is always enabled for JPG images on 4chan.

Daiz commented 9 years ago

First of all, this seems like a nice piece of work. For future efforts, though, it'd be nice if you could break something like this into multiple commits instead of just having one big commit.

As for the feature itself... as nice as it seems, it does have one pretty major flaw: it will not help with sourcing black & white images at all, since E-Hentai will automatically use hash matching if it detects uploaded images as monochrome. And those make up the vast majority of things people will be trying to look up a source for. As such, people would still be getting zero results for most JPG images and curse "why isn't ExLinks giving me any results for this image", which is what I wanted to reduce by disabling reverse image search for JPG images and adding a note about it.

However, I can see adding this in with a bit of extra work - when looking for results, you could check for the "image was detected as monochrome" text, and if that's present and there's zero results, give some indication about the 4chan image manipulation situation to the users, so they know why exactly they are getting zero results.
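The check described above could be sketched roughly as follows. This is only an illustration: `MONOCHROME_MARKER` is an assumed fragment of the results page text (the exact wording on E-Hentai is not quoted in full here), and `zeroResultNotice` and its message are hypothetical names, not part of ExLinks.

```javascript
// Assumed fragment of the notice on the results page; the real wording may differ.
const MONOCHROME_MARKER = 'detected as monochrome';

// Hypothetical helper: returns an explanatory notice only when the page
// reports a monochrome image AND the lookup produced zero results.
function zeroResultNotice(pageText, resultCount) {
  const monochrome = pageText.toLowerCase().includes(MONOCHROME_MARKER);
  if (monochrome && resultCount === 0) {
    return 'No results: the image was treated as monochrome, so hash matching ' +
      'was used, and 4chan modifies uploaded JPG files.';
  }
  return null; // nothing special to tell the user
}
```

Returning `null` in every other case keeps the normal result display untouched, so the extra note only appears in the one situation that would otherwise look like a bug.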

Torrentymous commented 9 years ago

The latest commit adds a note when an image was detected as monotone and restores the disabling warning from e2091e2c24bc12c83da678e47d7e000269e7db91 when the monotone image is a JPG from 4chan.

Daiz commented 9 years ago

So I checked this out in action and found a pretty major issue. For some reason, the POST GM_xmlhttpRequest seems to use regular XMLHttpRequest in Chrome, and as a result it gets cross-origin blocked over HTTP and mixed-content blocked over HTTPS.

At the same time I noticed that the reverse image search results in general seem to be bugging out somewhat - sometimes the results don't format and the result box also doesn't show up if you hide it and try to open it again...

Daiz commented 9 years ago

Looking into the matter, it unfortunately seems to be the case that making cross-domain POST requests with FormData can't really be straight-up done on Chrome with Tampermonkey. Relevant TM issue. This is a pretty unfortunate situation, as ExLinks should work properly in both Chrome and Firefox (with Tampermonkey and Greasemonkey respectively) at the very least... It might be possible to work around it by constructing the multipart/form-data request in a more manual fashion and embedding the image as base64 with Content-Transfer-Encoding: base64 (assuming that E-Hentai is willing to take image data as base64).
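A minimal sketch of the base64 idea, assuming the server would honor a `Content-Transfer-Encoding: base64` part header (which is not established here, and RFC 7578 discourages senders from generating that header for form-data). `buildBase64Part`, the field name, and the file name are hypothetical placeholders.

```javascript
// Wrapper so the sketch also runs outside a browser (e.g. in Node).
const toBase64 = (binary) =>
  typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');

// Build a one-part multipart/form-data body carrying the file as base64 text.
function buildBase64Part(boundary, fieldName, fileName, fileBytes) {
  let binary = '';
  for (const byte of fileBytes) binary += String.fromCharCode(byte);
  return [
    '--' + boundary,
    'Content-Disposition: form-data; name="' + fieldName + '"; filename="' + fileName + '"',
    'Content-Type: image/jpeg',
    'Content-Transfer-Encoding: base64',
    '',
    toBase64(binary),
    '--' + boundary + '--',
    ''
  ].join('\r\n');
}
```

Because the resulting body is pure ASCII, it would survive being sent as an ordinary string; the open question is only whether the receiving side decodes it.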

Torrentymous commented 9 years ago

Yes, I saw that and managed to craft the form data manually. Although that was unexpectedly hard (never forget the leading CRLF of the form data and the "binary" property), the similarity scan still doesn't work on Chrome because E-Hentai says the file is corrupted. Everything works fine on Firefox. However, I am now clueless as to why it won't work. This probably has something to do with Tampermonkey or Chrome somehow messing up the transmitted data even with the "binary" property set to "true".
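For readers unfamiliar with hand-built form data, here is a rough sketch of what such a body can look like. It is not the code from this pull request: `buildMultipartBody`, the field name, and the file name are hypothetical, and the sketch starts directly with the boundary line, so whether a leading CRLF is also needed depends on the receiving parser.

```javascript
// Build a multipart/form-data body as a "binary string": one character
// per byte, each with a char code in the range 0..255.
function buildMultipartBody(boundary, fieldName, fileName, mimeType, fileBytes) {
  let binary = '';
  for (const byte of fileBytes) binary += String.fromCharCode(byte);
  return [
    '--' + boundary,
    'Content-Disposition: form-data; name="' + fieldName + '"; filename="' + fileName + '"',
    'Content-Type: ' + mimeType,
    '', // blank line separates part headers from the payload
    binary,
    '--' + boundary + '--', // closing delimiter
    ''
  ].join('\r\n');
}

// Example with the first bytes of a JPEG file.
const body = buildMultipartBody('ExLinksBoundary', 'file', 'image.jpg',
  'image/jpeg', [0xff, 0xd8, 0xff, 0xe0]);
```

Such a body would then be passed as `data` to `GM_xmlhttpRequest` together with `binary: true` and a `Content-Type: multipart/form-data; boundary=ExLinksBoundary` header, which is the combination that reportedly works on Firefox but not on Chrome.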

Torrentymous commented 9 years ago

If you know of a way to debug XHRs made from Tampermonkey that'd be great. On Firefox I use TamperData but all Chrome equivalents didn't log my requests.

Daiz commented 9 years ago

I remember having similar XHR debugging issues in the past with no real solution for it, so I think you might just have to do a bunch of related console.logging.

For the "file corrupted" issue, did you try encoding the image in base64, or does E-Hentai just not accept that for file uploads in the first place?

Torrentymous commented 9 years ago

Actually, Wireshark works surprisingly well. I managed to see that the data transmitted is what you'd expect when the request is not sent as binary, which confirms my first guess. Either Tampermonkey doesn't support that property, or some extra manipulation is needed exclusively for Tampermonkey/Chrome. And using the native sendAsBinary() is out of the question since we need to make cross-origin requests.

As for the base64 encoding, I tried it, but it doesn't work, which is understandable because files are only supposed to be uploaded from the form on their website.
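On the Wireshark observation: one plausible reading (an assumption, since the captured bytes are not shown here) is that the binary string was encoded as UTF-8 text before sending. The sketch below shows how that alone corrupts a file, because every byte at or above 0x80 becomes two bytes on the wire.

```javascript
// First four bytes of a typical JPEG file.
const jpegHeader = [0xff, 0xd8, 0xff, 0xe0];

// The same bytes held as a "binary string" (one char per byte).
const asString = String.fromCharCode(...jpegHeader);

// What goes on the wire if the string is treated as text and UTF-8 encoded.
const sentAsText = Array.from(new TextEncoder().encode(asString));

// What goes on the wire if each char code is sent as one raw byte.
const sentAsBinary = Array.from(
  Uint8Array.from(asString, (ch) => ch.charCodeAt(0) & 0xff)
);

const hex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, '0')).join(' ');
console.log('as text:  ', hex(sentAsText));   // c3 bf c3 98 c3 bf c3 a0
console.log('as binary:', hex(sentAsBinary)); // ff d8 ff e0
```

A capture showing the `c3 ..` pattern in place of the original high bytes would be consistent with the "binary" flag being ignored.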

Torrentymous commented 9 years ago

I tried fiddling around with iframes without any success. There may be a workaround but I can't seem to find it. Any idea?

If this were an extension instead of a userscript, Chrome would probably be more lenient with its restrictions, but as it stands, if you want to include this feature, I fear that it would have to be available only for Firefox.

Daiz commented 9 years ago

I'm not against the idea of turning it into an extension for Chrome (while keeping it as a userscript on Firefox), though that'd involve quite a bit of work to get easy compilation going, not to mention getting a Chrome Web Store dev account... I'll think about it.