birdal46 / grafx2

Automatically exported from code.google.com/p/grafx2

Backup pages packing #174


GoogleCodeExporter commented 8 years ago
Backup page storage takes a lot of memory. We should compress it.
The compression must be lossless, of course, and it should be fast because it runs very often while you draw. Decompression may be a bit slower, but not too much. Advantages we could try to take into account:

- Calculate the delta between backup steps. The delta should be smaller than the picture and easier to compress. The delta should be computed against the NEXT step, so the undo can be fast. Once a page is "undone", keep it unpacked so the redo won't have problems depacking it.
- Group pixels. A big area in one single color should take very little space (like the starting black screen). RLE may be a good idea; it's simple and fast.
- Maybe re-use parts of our file save/load compression features, or use zlib or something like that (a rough sketch of the zlib option follows below). Do we have something fast enough?
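
For the zlib option, a minimal sketch of what a call could look like (the function name is hypothetical; compress2() and compressBound() are real zlib API):

```c
/* Hypothetical zlib-based page packer. Z_BEST_SPEED trades ratio
   for the fast compression the drawing path needs. */
#include <zlib.h>

/* Returns 1 on success. *dst_len must hold the capacity of dst on
   entry (compressBound(src_len) is always enough) and receives the
   packed size on exit. */
int pack_page_zlib(Bytef *dst, uLongf *dst_len,
                   const Bytef *src, uLong src_len)
{
    return compress2(dst, dst_len, src, src_len, Z_BEST_SPEED) == Z_OK;
}
```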

Original issue reported on code.google.com by pulkoma...@gmail.com on 9 Jun 2009 at 2:12

GoogleCodeExporter commented 8 years ago
Be careful with performances. I'd feel safer if it's something we can disable.

This is linked with issue 155, as it will require undo-redo to be a list 
instead of 
a loop.
Be careful that the first undo page has to be unencoded at all times, it's used 
(read) whenever you draw if the "FX feedback" setting is off.

One suggestion for packing: RLE-encode the XOR between the pages. Large unmodified areas will compress to nothingness automatically, and applying a change twice will revert it (so, after an undo, we won't need to compress the reversed differences to redo them later).
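
A minimal sketch of that idea, assuming 8-bit pixels, a fixed key byte, and a KEY,count,value run format (names and format details are hypothetical, not actual GrafX2 code):

```c
#include <stddef.h>
#include <stdint.h>

#define RLE_KEY 0xFF  /* escape byte marking a run */

/* XOR two pages of equal size; applying it twice restores 'cur'. */
static void page_xor(uint8_t *delta, const uint8_t *cur,
                     const uint8_t *prev, size_t len)
{
    for (size_t i = 0; i < len; i++)
        delta[i] = cur[i] ^ prev[i];
}

/* RLE-encode 'src' into 'dst'. Returns the packed size, or 0 when
   the output would exceed 'dst_cap' (page not worth packing). */
static size_t rle_pack(uint8_t *dst, size_t dst_cap,
                       const uint8_t *src, size_t len)
{
    size_t out = 0, i = 0;
    while (i < len) {
        size_t run = 1;
        while (i + run < len && src[i + run] == src[i] && run < 255)
            run++;
        if (run >= 3 || src[i] == RLE_KEY) {
            /* Runs, and literal key bytes, are escaped as
               KEY,count,value. */
            if (out + 3 > dst_cap) return 0;
            dst[out++] = RLE_KEY;
            dst[out++] = (uint8_t)run;
            dst[out++] = src[i];
        } else {
            /* Short runs of ordinary bytes are stored verbatim. */
            if (out + run > dst_cap) return 0;
            for (size_t j = 0; j < run; j++)
                dst[out++] = src[i];
        }
        i += run;
    }
    return out;
}
```

An unchanged area XORs to zeroes, which collapses into a few runs — the "compress to nothingness" case.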

Original comment by yrizoud on 9 Jun 2009 at 3:14

GoogleCodeExporter commented 8 years ago
RLE-XOR seems nice.
If it's too slow, maybe we can pack only old pages, for example keep the most recent 4 pages unpacked and pack the older ones. This way, the unpacking can even be done in a thread if we want it to look fast.
We can let the user set the number of pages he wants to keep unpacked, but that kind of setting sounds a little too technical to me...

Original comment by pulkoma...@gmail.com on 9 Jun 2009 at 3:21

GoogleCodeExporter commented 8 years ago
Keeping some recent pages unpacked won't make any difference in the long run: if you click N times per second, very regularly, then after some time you need to pack N pages per second to catch up / keep up.

I thought about the encoding a bit more, and it's still going to be costly... determining the least-used XOR color (which can sometimes be skipped...), then writing the encoded data into a big safe buffer, then allocating exactly the right size and copying the data to its new place...

Original comment by yrizoud on 9 Jun 2009 at 6:58

GoogleCodeExporter commented 8 years ago
Yes, but that's not the typical use of an undo button, is it?
Usually you only undo a few steps, so you will not reach the first packed page. If the unpacking is done in a separate thread, then it will continue after you stop clicking undo.
For the encoding, I think:
1) We could use 255 as the key and not look for the least-used color. This will work well because we are using XOR and talking about undo pages.
When doing RLE, having a wrong key is annoying only when you get patterns with many occurrences of the key, but only once at a time. I.e. you don't have FF FF FF (which can be coded as FF 03 FF), but you have a lot of xx FF FF xx or xx FF xx (which you have to code as xx FF 02 FF xx or xx FF 01 FF). I don't see cases where you could get that result in the XOR changes, as most likely you will change either:
- A very small part of the image (a small brush stroke);
- A big part of the image with the same single color (floodfill); here you can only get a big continuous area with the same value;
- The problematic case: when drawing filled shapes you replace a big part of the picture, so you basically get the old part of the picture in your XOR result. This also happens with gradients. Here the compression may end up suboptimal, but still better than the original.

About the memory allocation:
XOR+RLE is nice because you can almost always do it in place, overwriting the data you just packed. The only case where that fails is if your image begins with a lot of FF xx FF xx... and your RLE output ends up bigger than the data it has consumed so far. Maybe we can use a small temporary buffer in that case, and once we get an RLE sequence that is smaller than the original packed data, copy it back to the image buffer.

When the packing is finished, just realloc() the page buffer to the packed size and you're done.
To unpack, we can tweak our RLE so it stores the sequences in reverse, i.e. color;occurrences;key, and unpack backwards, starting from the end of the picture. This way we don't need any temporary buffer.
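
A sketch of that backwards unpacking, under the same hypothetical format (reading the reversed value,count,key triples from the stream's end yields key, count, value):

```c
/* In-place backwards unpack: packed bytes sit at the start of the
   full-size page buffer, runs were stored reversed as
   value,count,key, and the page is filled from its end. This works
   as long as the write cursor stays ahead of the read cursor, i.e.
   the stream never locally expands (the FF xx FF xx case above). */
static void rle_unpack_backwards(uint8_t *page, size_t page_len,
                                 size_t packed_len)
{
    size_t in = packed_len;   /* read cursor, walks down  */
    size_t out = page_len;    /* write cursor, walks down */

    while (out > 0) {
        if (in >= 3 && page[in - 1] == RLE_KEY) {
            /* Reversed run triple, read back-to-front. */
            uint8_t count = page[in - 2];
            uint8_t value = page[in - 3];
            in -= 3;
            while (count--)
                page[--out] = value;
        } else {
            page[--out] = page[--in];  /* verbatim byte */
        }
    }
}
```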

Final note: I think it would be wise to use 16-bit numbers instead of 8-bit when packing, because:
1) It allows a 16-bit occurrence counter, which means we are not limited to 256 occurrences of the same color, but 65536.
2) The key has lower odds of appearing in the packed data (1/65536 instead of 1/256).
3) It covers simple dither patterns ("trames") and packs them as well as single colors. This helps a lot when working on a Mode 1 Amstrad CPC picture (4 colors), for example, as graphicians tend to use such patterns a lot there.
4) It means we process the data twice as fast, or more, avoiding lots of unaligned memory accesses.

Maybe we could even go for 32-bit, but I'm not sure it's as efficient. In 32 bits it becomes easy to choose a key sequence that will almost never appear in a picture (e.g. FF 01 23 45), but in case of a miss we have to encode key + size + key = 96 bits for 32.
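
Sketched in the same hypothetical style as the 8-bit packer above, the 16-bit variant just changes the unit: the stream becomes a sequence of uint16_t words, a run is KEY,count,pixel-pair, and a simple two-pixel dither packs as well as a flat color:

```c
#define RLE_KEY16 0xFF42u  /* arbitrary 16-bit key, 1/65536 miss odds */

/* Same scheme as the 8-bit packer, but over pixel pairs; assumes an
   even page size. One run covers up to 65535 pairs (131070 pixels). */
static size_t rle_pack16(uint16_t *dst, size_t dst_cap,
                         const uint16_t *src, size_t len)
{
    size_t out = 0, i = 0;
    while (i < len) {
        size_t run = 1;
        while (i + run < len && src[i + run] == src[i] && run < 0xFFFF)
            run++;
        if (run >= 3 || src[i] == RLE_KEY16) {
            if (out + 3 > dst_cap) return 0;
            dst[out++] = RLE_KEY16;
            dst[out++] = (uint16_t)run;
            dst[out++] = src[i];
        } else {
            if (out + run > dst_cap) return 0;
            for (size_t j = 0; j < run; j++)
                dst[out++] = src[i];
        }
        i += run;
    }
    return out;
}
```

A key miss now costs key + count + key = 6 bytes instead of 3, which is the same trade-off weighed above for the 32-bit case.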

Original comment by pulkoma...@gmail.com on 9 Jun 2009 at 7:28

GoogleCodeExporter commented 8 years ago
I'm aware that efficiently determining the "magic value" is not so easy. Brute-force counting is possible, but expensive...

>> If you click N times per second
> That's not the typical use of an undo button ?
I meant clicking in the image with the normal freehand tools (which causes a new undo page to be encoded on each mouse click).

Packing pixel pairs: note that in many simple cases it reduces compression. When drawing a big solid shape on a monochrome background, each horizontal edge has a 50% chance of falling on an odd address, thus causing a single "FF 00" or "00 FF" pair to be needed. I see what you mean with patterns, but remember that only the XOR difference between two steps will be RLE-ed, and the pattern will not often be visible in what is drawn or erased between two steps.

About memory allocation: maybe we can use a chained list of 1 KB blocks (or smaller), and while encoding, take a new block (or an available one) whenever we reach the end of the current block. I know it rounds the memory usage up to the next kilobyte for each page, but then there's no longer any reallocation or data copy, and after a few minutes of drawing there are no longer any allocations or frees at all.
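
A sketch of that chained-block writer (block size and all names are illustrative, not GrafX2 code):

```c
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 1024

typedef struct Block {
    struct Block *next;
    size_t used;                 /* bytes written into data[] */
    uint8_t data[BLOCK_SIZE];
} Block;

static Block *free_blocks;       /* pool of blocks kept for reuse */

/* Take a block from the pool, or allocate one if the pool is dry.
   Once the pool is warm, no malloc happens at all, as described
   above. */
static Block *block_get(void)
{
    Block *b = free_blocks;
    if (b)
        free_blocks = b->next;
    else
        b = malloc(sizeof(Block));
    if (b) {
        b->next = NULL;
        b->used = 0;
    }
    return b;
}

/* Append one byte to the chain, extending it when a block fills.
   Returns 0 on out-of-memory. */
static int chain_put(Block **tail, uint8_t byte)
{
    if ((*tail)->used == BLOCK_SIZE) {
        Block *b = block_get();
        if (!b)
            return 0;
        (*tail)->next = b;
        *tail = b;
    }
    (*tail)->data[(*tail)->used++] = byte;
    return 1;
}
```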

Original comment by yrizoud on 10 Jun 2009 at 9:33

GoogleCodeExporter commented 8 years ago
Ah right, when drawing it won't change anything. But that's not a problem if the packing is done in a thread, because you don't need the packed data, so you don't have to wait for it. And once the main program is idle, the packing thread can catch up and finish its job.
It's only important to react fast when undoing: there you ask the program to unpack the data, then want to access it immediately.
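
A rough pthread sketch of that deferred packing (the queue, its size, and pack_page() are hypothetical; a real port would use the program's own thread wrapper):

```c
#include <pthread.h>

typedef struct BackupPage BackupPage;   /* opaque here; see below */
void pack_page(BackupPage *page);       /* hypothetical XOR+RLE pack */

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;
static BackupPage *pack_queue[64];
static int queue_len;

/* Drawing code: hand a fresh backup page to the packer and return
   immediately; nothing waits on the packed result. */
void schedule_pack(BackupPage *page)
{
    pthread_mutex_lock(&queue_lock);
    if (queue_len < 64)              /* drop-oldest would also work */
        pack_queue[queue_len++] = page;
    pthread_cond_signal(&queue_cond);
    pthread_mutex_unlock(&queue_lock);
}

/* Worker thread: catches up whenever the main program is idle. */
void *pack_worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (queue_len == 0)
            pthread_cond_wait(&queue_cond, &queue_lock);
        BackupPage *page = pack_queue[--queue_len];
        pthread_mutex_unlock(&queue_lock);
        pack_page(page);
    }
    return NULL;
}
```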

Packing pairs: a lone FF 00 only takes 2 bytes. Having the number of occurrences coded in 2 bytes will help a lot too, because with only 1 byte you can pack chains of 256 pixels max. So a black 320x200 page packs as 250 repetitions of FF 00 00 (FF being the key, 00 the number of occurrences (256), and 00 the color). But in 16-bit it would pack as FF 42 7D 00 00 00 (FF42 = key, 7D00 = 32000 occurrences, 0000 = pixel pair). That is 125 times smaller :).
We could use single pixels with an 8-bit key and 2 bytes for the size too, but it sounds slightly more complex to implement, and we'd have to count FA00 (64000) occurrences for a 320x200 image, and anything bigger than 65536 would need two entries (e.g. a 320x240 image).
As most of the time there are no big solid shapes drawn (you do that at the very start of your work, then you go for fine pixelling), I think it's a good idea. And the 16-bit key has less chance of appearing in an image, so even when there is no pattern, we get an advantage.

For the packing, we could use a temporary page, write to it, and if it fills up, discard the result, because it means the packed page is bigger than the original. Of course page headers will have to get a tag saying "this page is packed" or "this page is not packed" (see the sketch below). Once the packing is finished, just exchange the original page with the packed one in the backup list. The old original page is still available for undoing until we make a new backup; then it serves as the buffer for packing the next backup.
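
The header tag and buffer exchange could look like this, giving fields to the hypothetical BackupPage from the thread sketch above (not the actual Page struct):

```c
#include <stddef.h>
#include <stdint.h>

struct BackupPage {
    int packed;         /* 0 = raw pixels, 1 = RLE-packed delta */
    size_t data_size;   /* bytes currently used in data         */
    uint8_t *data;
};

/* On a successful pack, swap buffers: the packed copy goes into the
   backup list, and the old raw page is kept as the scratch buffer
   for packing the next backup. */
static void page_swap(struct BackupPage *a, struct BackupPage *b)
{
    struct BackupPage tmp = *a;
    *a = *b;
    *b = tmp;
}
```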

Original comment by pulkoma...@gmail.com on 10 Jun 2009 at 9:50


GoogleCodeExporter commented 8 years ago
I propose to drop the idea.
At the moment, because of the layer system, the pages are shared by different backup steps: unmodified layers are only stored once.

I made a counter in the "stats" page to display the number of "pages" and the total of their allocations. It can be used to watch over things and see if a problem of memory usage arises.

Original comment by yrizoud on 13 Jan 2010 at 1:00

GoogleCodeExporter commented 8 years ago
OK, postponed to 2.3, which will feature animation and is more likely to need it.

Original comment by pulkoma...@gmail.com on 13 Jan 2010 at 1:02


GoogleCodeExporter commented 8 years ago
Will likely never happen.

Original comment by yrizoud on 8 Mar 2012 at 7:26

GoogleCodeExporter commented 8 years ago
Issue 491 has been merged into this issue.

Original comment by pulkoma...@gmail.com on 10 Jun 2012 at 8:20

GoogleCodeExporter commented 8 years ago
There is user interest in more backup pages. Reopening.

Original comment by pulkoma...@gmail.com on 10 Jun 2012 at 8:21