alteredgenome / grafx2

Automatically exported from code.google.com/p/grafx2

24bit PNG support + color reduction speedup #201

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Add support for 24bit PNG.

Some notes on implementation:
- The color reduction and dithering code only expects the picture size and 
a raw array of 24bit pixels. That should not be hard to get.
- If there are 256 colors or fewer, there will be no dithering.
- Dithering can also be disabled entirely: just don't call the Floyd-
Steinberg function. However, I think it's done at the same time as palette 
mapping. The steps are actually:

1) get all colors in image with their colorcount
== Color reduction
2) map them in a cube in (R,G,B) space
3) make the cube as small as possible without getting a color out of it 
(most of the time this will not change anything as the image will have 
both black and white in it)
4) split the cube on the color that has the biggest delta
5) start over with the two smaller cubes and iterate until you have 256 
cubes in your colorspace
== Image mapping
6) create a mapping table from each cube to its middle color
7) for each pixel in the original image, replace it with the 24-bit 
replacement color from the cube
8) compute the difference between the old and new color of the pixel and 
spread it over the right, bottom and bottom-right pixels (which all still 
have their original 24bit color) << THIS IS THE FLOYD STEINBERG THAT 
SHOULD BE DISABLED
9) map each 24bit color to its palette index, creating the final image 
(this can be done in the same buffer, and 2/3 of its memory will be 
freed once it's finished; however, using another buffer is more cache 
friendly)
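
As a rough illustration of steps 3 and 4, here is a minimal C sketch. It is 
not the GrafX2 code: the ColorCount and Box types and all function names are 
made up for this example, and splitting at the midpoint of the widest channel 
is an assumption (the steps above only say to split on the channel with the 
biggest delta; classic median cut splits at the population median instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: one entry per distinct color found in step 1. */
typedef struct { uint8_t c[3]; uint32_t count; } ColorCount;
typedef struct { uint8_t lo[3], hi[3]; } Box;

/* Step 3: tightest (R,G,B) bounds around colors[0..n). */
Box shrink_box(const ColorCount *colors, size_t n)
{
    Box b = { {255, 255, 255}, {0, 0, 0} };
    for (size_t i = 0; i < n; i++)
        for (int k = 0; k < 3; k++) {
            if (colors[i].c[k] < b.lo[k]) b.lo[k] = colors[i].c[k];
            if (colors[i].c[k] > b.hi[k]) b.hi[k] = colors[i].c[k];
        }
    return b;
}

/* Step 4, first half: channel (0=R, 1=G, 2=B) with the biggest delta. */
int widest_axis(const Box *b)
{
    int best = 0;
    for (int k = 1; k < 3; k++)
        if (b->hi[k] - b->lo[k] > b->hi[best] - b->lo[best]) best = k;
    return best;
}

/* Step 4, second half: partition the colors in place around the midpoint
   of the widest channel (assumed split rule). Returns how many colors
   ended up in the lower half; the rest follow in the same array. */
size_t split_colors(ColorCount *colors, size_t n)
{
    assert(colors != NULL || n == 0);
    Box b = shrink_box(colors, n);
    int axis = widest_axis(&b);
    int mid = (b.lo[axis] + b.hi[axis]) / 2;
    size_t i = 0, j = n;
    while (i < j) {
        if (colors[i].c[axis] <= mid) {
            i++;
        } else {
            ColorCount t = colors[i];
            colors[i] = colors[--j];
            colors[j] = t;
        }
    }
    return i;
}
```

Step 5 would then call split_colors again on each half (for example always 
on the box with the biggest delta) until there are 256 boxes.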

Removing step 8 entirely should be fine and would avoid a lot of memory 
writes. The current method does everything in a single buffer, meaning 
every single pixel write invalidates the CPU cache line and the cache has 
to be flushed for each pixel. Of course this is slow!
Using separate buffers will be a lot more memory hungry, but also more 
cache friendly.

As it is not really a problem for small pictures, I think we could go for 
a temporary 2-line buffer where we do all the dithering and palette 
mapping, and generate the final image data on the fly instead of having a 
24bit but effectively 256 color image in memory at some point.

The error to add to the right pixel can be stored in a temp var and used 
immediately in the next step; the error to add to the line below can be 
stored in a buffer table. This way we can go for a 1-line buffer for Floyd-
Steinberg dithering, and no buffer at all if it is disabled.
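
The one-line error buffer idea could look roughly like the C sketch below. 
It is only an illustration under assumptions, not the GrafX2 code: 
map_to_palette, nearest_index and clamp255 are hypothetical names, the 
palette lookup is a brute-force nearest-color search rather than the cube 
mapping table, and the weights are the textbook Floyd-Steinberg ones (7/16 
right, 3/16 below-left, 5/16 below, 1/16 below-right), which also touch the 
below-left pixel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: index of the palette entry (3 bytes each)
   closest to (r,g,b), by squared distance. */
static int nearest_index(const uint8_t *pal, int ncolors, int r, int g, int b)
{
    int best = 0;
    long best_d = -1;
    for (int i = 0; i < ncolors; i++) {
        long dr = r - pal[i * 3], dg = g - pal[i * 3 + 1], db = b - pal[i * 3 + 2];
        long d = dr * dr + dg * dg + db * db;
        if (best_d < 0 || d < best_d) { best_d = d; best = i; }
    }
    return best;
}

static int clamp255(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

/* Convert w*h RGB pixels (src) to palette indices (dst), writing the
   final 8bit image directly. Errors are kept as numerators over 16.
   Returns 0 on success, -1 if the line buffer cannot be allocated. */
int map_to_palette(const uint8_t *src, uint8_t *dst, int w, int h,
                   const uint8_t *pal, int ncolors, int dither)
{
    assert(w > 0 && h > 0 && ncolors > 0 && ncolors <= 256);
    int *below = NULL;              /* error for the next line, w*3 ints */
    if (dither) {
        below = calloc((size_t)w * 3, sizeof *below);
        if (!below) return -1;
    }
    for (int y = 0; y < h; y++) {
        int right[3] = {0, 0, 0};   /* temp var: error for the next pixel */
        int pending[3] = {0, 0, 0}; /* below-right error, not yet stored */
        for (int x = 0; x < w; x++) {
            const uint8_t *p = src + ((size_t)y * w + x) * 3;
            int c[3];
            for (int k = 0; k < 3; k++) {
                int v = p[k];
                if (dither) v += (right[k] + below[x * 3 + k]) / 16;
                c[k] = clamp255(v);
            }
            int idx = nearest_index(pal, ncolors, c[0], c[1], c[2]);
            dst[(size_t)y * w + x] = (uint8_t)idx;
            if (!dither) continue;  /* no buffer at all in this case */
            for (int k = 0; k < 3; k++) {
                int e = c[k] - pal[idx * 3 + k];
                right[k] = 7 * e;
                if (x > 0) below[(x - 1) * 3 + k] += 3 * e;
                /* below[x] was already read for this line: overwrite it */
                below[x * 3 + k] = pending[k] + 5 * e;
                pending[k] = e;
            }
        }
    }
    free(below);
    return 0;
}
```

The source image is only read, never written, so the 24bit data is not 
touched by the dithering, and the only extra memory is one line of errors.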

Original issue reported on code.google.com by pulkoma...@gmail.com on 5 Aug 2009 at 2:35

GoogleCodeExporter commented 9 years ago
I can confirm 24bit png loading is going to be straightforward.
About FS dithering, are you proposing to remove it (all the time, or 
according to an INI setting), or to speed it up?
It's as you wish. I am the only one who reported slowness, as it was 
"surprising" (i.e. on my machine, DOS image viewers do it faster), but I 
know the 24->8bit reduction is something you only do once per image, before 
working on it, so it's not critical.

Original comment by yrizoud on 5 Aug 2009 at 3:26

GoogleCodeExporter commented 9 years ago
I think I can improve it, but I'm all for the possibility to disable it 
entirely. As both need some code rewrite, I think it should be fine to do 
them at once.
GrafX2 uses tricks to save memory, and I think this makes it quite 
unfriendly to the cache, making it really slow on modern machines. DOS 
programs may not be as heavily hand-optimized for the 486, or may be 
optimized for the Pentium, meaning they take the cache into account and end 
up being faster.

Original comment by pulkoma...@gmail.com on 5 Aug 2009 at 3:34

GoogleCodeExporter commented 9 years ago
Started 24bit loading

Original comment by yrizoud on 8 Aug 2009 at 8:29

GoogleCodeExporter commented 9 years ago
Seems to work

Original comment by yrizoud on 25 Aug 2009 at 8:44