Cortys / comic-backup

Back up your comics as CBZ.
https://cmxlgy.wordpress.com
GNU General Public License v3.0

Backup has artifact on all pages #73

Closed hwharrison closed 5 years ago

hwharrison commented 6 years ago

Every backup I make has a small artifact in the bottom-right corner of every image. The artifact is ten or fifteen pixels of the wrong color. They are not always the same color, but they always contrast with the dominant color of the area (and draw the eye). They all sit in the same row, close together but not all touching.

This artifact is not present in the comic reader, and I've reproduced the bug on another computer.

sgbeal commented 6 years ago

Upon very close inspection i'm seeing this, too. It looks almost like a 1-pixel-high digital watermark (and it may well be). i SUSPECT that the online viewer may be disallowing scrolling to that line, effectively trimming it from view (since it's only 1-2 pixels high, it wouldn't be noticed), but keeps it on the canvas so that grabbers like this one end up picking it up.

Screenshot: selection_025

If it is indeed a watermark, i don't consider this to be a bug. Rather, i see it as a sign that Amazon has accepted the fact that people will, some way or another, save the canvas pixels, and that Amazon may stop trying to fight that via continual changes to the online reader.

sgbeal commented 6 years ago

i just went back and checked some of my oldest backups. Unfortunately i can't say exactly how old they are, since my file timestamps were all artificially modified at some point for long, boring reasons. They're at least 2-3 years old, though, as i've been using this software since Jan. 2015. They all seem to have this. i have not compared them pixel-by-pixel in GIMP, but my instinct says that these are digital watermarks, perhaps containing my Comixology user name (which is 6 letters long, and it would be interesting to know if the size of these marks differs depending on the user's name).

sgbeal commented 6 years ago

More details: i just checked some of the CBZs downloaded via Comixology's own CBZ-format backups and couldn't find this mark in any of those books. i still think it's a watermark added by the online viewer.

Cortys commented 6 years ago

The artifacts are indeed digital watermarks and they are added by the extension.

A few years ago ComiXology actively tried to have the extension removed from GitHub because it was being used for piracy. To put an end to that, the watermarks and the username text files in the CBZs were added. Apart from the repo name change from comixology-backup to comic-backup, we haven't had any legal issues since.

I don't think it's a good idea to remove the watermarks, but if the current implementation is too distracting, I'm open to discussing alternatives.

I already thought about adding watermarking that changes only a few pixels, seemingly at random, all over the images to make the marks practically invisible to the human eye, but I wasn't sure that was worth the effort since nobody had complained about it until now.

Hope this clears things up.

sgbeal commented 6 years ago

This is the first time that anyone has noticed, so it's certainly not "distracting". i would never have noticed them if nobody had pointed them out. If this keeps Amazon happy, i consider it to be a feature.

hwharrison commented 6 years ago

Ah, that makes sense. They're really high-contrast; I'm surprised it doesn't bother anybody else. I do consider stuck pixels a personal nemesis, though, so maybe I'm outside the norm here.

@Cortys: Instead of a more advanced, time-consuming change, would you consider changing the pixel-altering part of getUserImage() from

if(c.charAt(e) * 1 && hsl[2] < 0.65)
    hsl[2] = 0.65;
else if(c.charAt(e) * 1 === 0 && hsl[2] > 0.35)
    hsl[2] = 0.35;

to

if(c.charAt(e) * 1 && hsl[2] < 0.65)
    hsl[2] += 0.15;
else if(c.charAt(e) * 1 === 0 && hsl[2] > 0.35)
    hsl[2] -= 0.15;

That way the watermark is still there, and you can still find it if you're looking, but it doesn't stick out like a sore thumb.

Cortys commented 6 years ago

Relative lightness changes cannot be used to encode the information since they would require the original lightness to be available for reference. I experimented with tighter tolerances in the past (0.45, 0.55) but those regularly caused errors because of the JPEG compression.
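To make the point about relative changes concrete, here is a minimal sketch. The function names and the 0.5 decoding midpoint are assumptions for illustration, not code from the extension; only the 0.35/0.65 clamp values and the 0.15 offset come from the snippets quoted earlier in the thread.

```javascript
// Sketch only: encodeBit, encodeBitRelative and decodeBit are hypothetical
// helpers, not functions from the extension. Values are HSL lightness in [0, 1].

// Absolute scheme (mirrors the quoted clamping): a 1 bit forces lightness
// up to at least 0.65, a 0 bit forces it down to at most 0.35.
function encodeBit(lightness, bit) {
  if (bit === 1 && lightness < 0.65) return 0.65;
  if (bit === 0 && lightness > 0.35) return 0.35;
  return lightness;
}

// Relative scheme (the suggested change): nudge by 0.15 instead of clamping.
function encodeBitRelative(lightness, bit) {
  if (bit === 1 && lightness < 0.65) return lightness + 0.15;
  if (bit === 0 && lightness > 0.35) return lightness - 0.15;
  return lightness;
}

// Assumed decoder: it sees only the marked pixel, so it compares the
// lightness against a fixed midpoint.
function decodeBit(lightness) {
  return lightness >= 0.5 ? 1 : 0;
}

// A dark pixel (lightness 0.10) that should carry a 1 bit:
console.log(decodeBit(encodeBit(0.10, 1)));         // 1: 0.65 is above the midpoint
console.log(decodeBit(encodeBitRelative(0.10, 1))); // 0: 0.25 still reads as dark
```

With the relative version, the decoder could only recover the bit by comparing 0.25 with the original 0.10, which it does not have. The clamped version survives because the marked value alone carries the bit.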

Currently I don't have the time to work on this but I will consider adding a less visible watermarking implementation in the next release. Using a pseudo-random number generator with a set of fixed seeds to generate the indices of the pixels that will be used to hold the binary username data should probably work quite well. The watermarking algorithm could try each of its seeds and use the one that causes the least amount of changes.
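A rough sketch of how that seeded approach could look, under several assumptions: the image is reduced to a flat array of lightness values, the bits are already extracted from the username, and every name below (mulberry32, pickIndices, countChanges, embed, extract) is hypothetical rather than part of the extension. The generator is a small mulberry32-style PRNG chosen only because it is deterministic for a given seed.

```javascript
// Hypothetical sketch, not the extension's implementation.

// Small deterministic PRNG (mulberry32-style); returns floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Derive `count` distinct pixel indices from a seed.
function pickIndices(seed, count, pixelCount) {
  if (count > pixelCount) throw new RangeError("not enough pixels");
  const rand = mulberry32(seed);
  const seen = new Set();
  const indices = [];
  while (indices.length < count) {
    const i = Math.floor(rand() * pixelCount);
    if (!seen.has(i)) {
      seen.add(i);
      indices.push(i);
    }
  }
  return indices;
}

// Count how many of the chosen pixels would have to be altered.
function countChanges(lightness, bits, indices) {
  let changes = 0;
  for (let k = 0; k < bits.length; k++) {
    const l = lightness[indices[k]];
    if ((bits[k] === 1 && l < 0.65) || (bits[k] === 0 && l > 0.35)) changes++;
  }
  return changes;
}

// Try every seed, keep the one needing the fewest changes, then clamp.
function embed(lightness, bits, seeds) {
  let best = null;
  for (const seed of seeds) {
    const indices = pickIndices(seed, bits.length, lightness.length);
    const changes = countChanges(lightness, bits, indices);
    if (best === null || changes < best.changes) best = { seed, indices, changes };
  }
  const out = lightness.slice();
  best.indices.forEach((idx, k) => {
    if (bits[k] === 1 && out[idx] < 0.65) out[idx] = 0.65;
    else if (bits[k] === 0 && out[idx] > 0.35) out[idx] = 0.35;
  });
  return { lightness: out, seed: best.seed, changes: best.changes };
}

// Read the bits back for a known seed, using an assumed 0.5 midpoint.
function extract(lightness, bitCount, seed) {
  return pickIndices(seed, bitCount, lightness.length)
    .map((i) => (lightness[i] >= 0.5 ? 1 : 0));
}
```

A reader of the watermark would still need some way to tell which of the fixed seeds was used (for example, by trying each one and validating the result); the thread does not specify how, so this sketch simply returns the chosen seed from embed.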

A PR would of course also be greatly appreciated.

ghost commented 6 years ago

I agree with sgbeal - I have never noticed it until this was pointed out and even now, it doesn't really impact on my reading experience... Such watermarking is also a very small price to pay for certain companies turning a blind eye to this project.