juninho12 / freearc

Automatically exported from code.google.com/p/freearc

Fix "Stack space overflow. Use `+RTS -Ksize'" on 14k files compression #269

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
arc create -mx -mt24 -ld384mb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs*

What is the expected output? What do you see instead?
FreeArc 0.666 Creating archive: test.arc using 
rep:384mb+exe+delta+tempfile+lzma:178mb:normal:bt4:128, $obj => 
rep:384mb+delta+tempfile+lzma:178mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:3355443b:m1:l2048:h15:a
Memory for compression 1820mb, decompression 992mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed  23.6%
ERROR: can't allocate memory required for (de)compression in 
lzma:178mb:normal:bt4:128

What version of the product are you using? On what operating system?
FreeArc 0.666, download "Linux: generic binary package (x86)", run on a Linux 
server with 24 processing threads and 32Gb RAM.

Please provide any additional information below.
An NFS mount is used for workdir only because of a lack of temp disk space on 
the machine.
The files to compress are highly similar text or XML files; the usual 
compression ratio is around 0.4%.
If I reduce the data set to compress, for example by compressing one directory 
(around 250Gb) separately, it works fine.

uname -a:
Linux TT-PROD01 2.6.32-5-amd64 #1 SMP Fri Sep 9 20:23:16 UTC 2011
x86_64 GNU/Linux

CPU:
2 x Intel(R) Xeon(R) X5650 @ 2.67GHz (6 cores each + HyperThreading)

Extract from "top" command:
Mem:  37143648k total, 37034836k used,   108812k free,    38868k buffers
Swap: 29294584k total,     8340k used, 29286244k free, 28388360k cached

Thanks for freearc Bulat,

Marek

Original issue reported on code.google.com by mvanl...@gmail.com on 25 Oct 2011 at 12:00

GoogleCodeExporter commented 9 years ago
freearc on linux can't detect how much memory is available and defaults to 2gb. 
try -lc1500m or so to decrease mem. usage. you can put the option into 
arc.ini

Original comment by bulat.zi...@gmail.com on 25 Oct 2011 at 8:06

GoogleCodeExporter commented 9 years ago
The extract from the "top" command may have misrepresented the situation.
Here is the output of the "free" command:
             total       used       free     shared    buffers     cached
Mem:      37143648   37032588     111060          0      40380   30465908
-/+ buffers/cache:    6526300   30617348
Swap:     29294584       5992   29288592

And an extract from "atop":
[...]
MEM | tot   35.4G | free  106.9M | cache  29.1G | buff   39.4M | slab  129.9M |
SWP | tot   27.9G | free   27.9G |              | vmcom   6.1G | vmlim  45.6G |
[...]

As you see, most of the "used" memory is in fact buffer/cache. So actually, I 
think it wasn't caused by lack of RAM (it would have swapped I guess), but by 
another problem. 
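
For reference, the "-/+ buffers/cache" row can be recomputed from the "Mem:" row of the "free" output above. A minimal Python sketch, using only the figures printed there (in KiB); the variable names are illustrative:

```python
# Figures in KiB, copied from the "Mem:" row of the "free" output above.
total, used, free = 37143648, 37032588, 111060
buffers, cached = 40380, 30465908

# Memory held by applications: "used" minus what the kernel keeps as buffers/cache.
used_by_apps = used - buffers - cached
# Memory the kernel could hand out: "free" plus the reclaimable buffers/cache.
reclaimable_free = free + buffers + cached

print(used_by_apps, reclaimable_free)
```

This reproduces the 6526300 / 30617348 values of the "-/+ buffers/cache" row, i.e. only about 6.5 GB is really held by applications.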

I've also tried -m1 instead of -mx, which also failed, though with a different 
message (and I can't believe there isn't around 200Mb free on this 32Gb 
server ;) :
# arc create -m1 -mt24 -ld384mb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs*
FreeArc 0.666 Creating archive: test.arc using 4x4:tor:3:2mb:h256kb, $exe => 
exe+4x4:tor:3:2mb:h256kb, $wav => mm:d1+4x4:tor:3:2mb:h256kb:t0, $bmp => 
mm:d1+4x4:tor:3:2mb:h256kb:t0, $compressed => storing
Memory for compression 218mb, decompression 205mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed 100.0%Stack space 
overflow: current size 80000000 bytes.
Use `+RTS -Ksize' to increase it.

Finally, I am now testing with -mx and a limit on compression memory; I will 
post the results here as soon as the process is completed (700Gb is a lot to 
compress, even for freearc ;-))

Original comment by mvanl...@gmail.com on 26 Oct 2011 at 8:57

GoogleCodeExporter commented 9 years ago
PS: This issue could probably be merged with issue 167.

Original comment by mvanl...@gmail.com on 26 Oct 2011 at 9:02

GoogleCodeExporter commented 9 years ago
Tests are not proving great so far:

Test 1: -lc2gb
--------------
$ arc create -mx -mt24 -lc2gb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs*

FreeArc 0.666 Creating archive: test.arc using 
rep:1536mb+exe+delta+tempfile+lzma:178mb:normal:bt4:128, $obj => 
rep:1536mb+delta+tempfile+lzma:178mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:8mb:m1:l2048:h15:a
Memory for compression 1820mb, decompression 1544mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed  23.6%
ERROR: can't allocate memory required for (de)compression in 
lzma:178mb:normal:bt4:128

Test 2: -ld384mb -lc1500mb
--------------------------
$ arc create -mx -mt24 -ld384mb -lc1500mb --display --cache16mb 
--workdir=aNFSMount -r test.arc someBigDirs*

FreeArc 0.666 Creating archive: test.arc using 
rep:384mb+exe+delta+tempfile+lzma:130mb:normal:bt4:128, $obj => 
rep:384mb+delta+tempfile+lzma:130mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:3355443b:m1:l2048:h15:a
Memory for compression 1364mb, decompression 992mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed   99.9%
Stack space overflow: current size 80000000 bytes.
Use `+RTS -Ksize' to increase it.

Apart from the fact that it still fails despite the compression memory limits, 
what I also don't understand is why test 2 reports "decompression 992mb" 
although I passed the option "-ld384mb".

Original comment by mvanl...@gmail.com on 28 Oct 2011 at 7:32

GoogleCodeExporter commented 9 years ago
>I think it wasn't caused by lack of RAM

it's not lack of RAM, it's lack of contiguous address space for this 32-bit 
program
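
A toy illustration of the difference between total free space and the largest contiguous block. This is only a sketch in Python: the region layout and the 1024 MiB request are invented for the example and are not measured from arc.

```python
# Hypothetical free regions (start, end) in MiB inside one process's address
# space. The layout is invented for illustration; it is not measured from arc.
FREE_REGIONS = [(100, 700), (900, 1500), (1700, 2300), (2500, 3000)]

def total_free(regions):
    """Sum of all free space, regardless of where it sits."""
    return sum(end - start for start, end in regions)

def largest_block(regions):
    """Size of the biggest single contiguous free region."""
    return max(end - start for start, end in regions)

def can_allocate(regions, size_mib):
    """A single allocation must fit into one contiguous region."""
    return largest_block(regions) >= size_mib

print(total_free(FREE_REGIONS), largest_block(FREE_REGIONS))
print(can_allocate(FREE_REGIONS, 1024))
```

In this made-up layout 2300 MiB is free in total, but no single block is larger than 600 MiB, so a 1024 MiB request fails no matter how much RAM the host has.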

>Use `+RTS -Ksize' to increase it.

this time you have no problems with contiguous address space, it's just what it 
says - add "+RTS -K100m" to cmdline. although i haven't seen this problem on 
windows, i should look into it
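
To compare -K values with the limit printed in the error, here is a small Python sketch. It assumes that the "m" suffix means 10**6 bytes, which is inferred only from the messages in this thread ("-K100m" is reported as "current size 100000000 bytes"); the helper name is hypothetical.

```python
# Assumption: "m" means 10**6 bytes here, inferred only from the messages in
# this thread ("-K100m" is reported as "current size 100000000 bytes").
def stack_limit_bytes(k_option):
    """Convert a value such as "100m" (from "+RTS -K100m") to bytes."""
    value = k_option.lower()
    if value.endswith("m"):
        return int(value[:-1]) * 10**6
    return int(value)  # plain byte count, no suffix

# The "current size" reported in the runs that did not pass +RTS -K.
REPORTED_LIMIT = 80_000_000

for option in ("100m", "256m", "512m"):
    limit = stack_limit_bytes(option)
    print(option, limit, round(limit / REPORTED_LIMIT, 2))
```

Under that assumption -K100m raises the limit by only a quarter over the 80000000 bytes reported without the option.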

>PS: This issue could probably be merged with issue 167.

yes. i will make this issue about the +RTS -K problem, which looks more serious. 
and since i don't know how to detect the largest memory block available to a 
Linux program, i will just add "use -lc/-ld to limit memory usage" to the 
"can't allocate memory required for (de)compression" error message

>why does it state in test 2 "decompression 992mb" while I put option 
"-ld384mb" 

freearc uses several compression algos sequentially. it may then decompress 
them sequentially, writing intermediate data to the disk, or all at the same 
time. if, for example, there are 3 algos that require 384 mb each, then the 
minimum amount of memory required for decompression (regulated by -ld) will be 
384 mb, and the maximum amount of memory that may be used for decompression 
(printed here) is 3*384 mb
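
The min/max rule from the example above, as a short Python sketch. The function name is illustrative and the rule is only the simplified one described in this comment, not freearc's actual accounting.

```python
def decompression_memory(algo_mb):
    """Return (minimum, maximum) decompression memory in mb for a chain of algos.

    minimum: algos run one at a time, intermediate data goes to disk,
             so only the largest single algo has to fit.
    maximum: all algos run at the same time, so their needs add up.
    """
    return max(algo_mb), sum(algo_mb)

# The example from the comment: 3 algos that require 384 mb each.
print(decompression_memory([384, 384, 384]))
```

For the three 384 mb algos this gives a minimum of 384 mb (what -ld regulates) and a maximum of 1152 mb (the kind of figure that is printed).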

Original comment by bulat.zi...@gmail.com on 30 Oct 2011 at 1:24

GoogleCodeExporter commented 9 years ago
Tried with:
$ arc create -mx -mt8 -lc1700mb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs* +RTS -K100m

FreeArc 0.666 Creating archive: test.arc using 
rep:1444mb+exe+delta+tempfile+lzma:147mb:normal:bt4:128, $obj => 
rep:1444mb+delta+tempfile+lzma:147mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:8mb:m1:l2048:h15:a
Memory for compression 1708mb, decompression 1452mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed  99.9%Stack space 
overflow: current size 100000000 bytes.
Use `+RTS -Ksize' to increase it.

Now trying again with -K512m  .... :(

Original comment by mvanl...@gmail.com on 4 Nov 2011 at 3:19

GoogleCodeExporter commented 9 years ago
i suggest you decrease the -lc value first. -lc1g or so

Original comment by bulat.zi...@gmail.com on 4 Nov 2011 at 4:39

GoogleCodeExporter commented 9 years ago

Original comment by bulat.zi...@gmail.com on 5 Nov 2011 at 2:20

GoogleCodeExporter commented 9 years ago

Original comment by bulat.zi...@gmail.com on 5 Nov 2011 at 2:37

GoogleCodeExporter commented 9 years ago
With -K512m ... :-)
$ arc create -mx -mt8 -lc1700mb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs* +RTS -K512m

FreeArc 0.666 Creating archive: test.arc using 
rep:1444mb+exe+delta+tempfile+lzma:147mb:normal:bt4:128, $obj => 
rep:1444mb+exe+delta+tempfile+lzma:147mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:8mb:m1:l2048:h15:a
Memory for compression 1708mb, decompression 1452mb, cache 16mb
Compressed 14,459 files, 712,588,416,460 => 4,871,896,783 bytes. Ratio 0.6%
Compression time: cpu 153876.99 secs, real 103658.75 secs. Speed 6,874 kB/s
All OK

Not very quick for such a "monster" server (I got much better results, around 
14Mb/s, with just -mx on "someSmallerDirs*", strange), but at least it works :-)
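
For what it's worth, the summary figures can be rechecked from the log above. A quick Python sketch using only the numbers printed there; treating kB as 1000 bytes is an assumption on my side:

```python
# Figures copied from the "All OK" run above.
original_bytes = 712_588_416_460
compressed_bytes = 4_871_896_783
cpu_secs = 153876.99
real_secs = 103658.75

ratio_percent = 100 * compressed_bytes / original_bytes
speed_kb_per_sec = original_bytes / real_secs / 1000  # kB taken as 1000 bytes
cpu_per_real = cpu_secs / real_secs  # average CPU-seconds per wall-clock second

print(f"{ratio_percent:.2f}% {speed_kb_per_sec:,.0f} kB/s {cpu_per_real:.2f}")
```

The speed comes out at the printed 6,874 kB/s and the ratio at about 0.68%. The cpu/real quotient is about 1.48, so on average only about one and a half CPU-seconds were spent per second of wall-clock time, even with -mt8.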

Next test, for thoroughness: the -lc1g suggestion, without +RTS -Ksize... (in 
progress)

Original comment by mvanl...@gmail.com on 5 Nov 2011 at 10:17

GoogleCodeExporter commented 9 years ago
oh, i didn't realize that you have compressed a lot of data. now it seems a 
duplicate of http://code.google.com/p/freearc/issues/detail?id=253

i should try it myself :)

Original comment by bulat.zi...@gmail.com on 5 Nov 2011 at 10:35

GoogleCodeExporter commented 9 years ago
Can you check whether memory usage increased during the operation?

you can try with -m1 to increase the speed

Original comment by bulat.zi...@gmail.com on 5 Nov 2011 at 10:36

GoogleCodeExporter commented 9 years ago
I did not monitor memory usage for the other runs except for the initial error 
reporting. 
Yes indeed, I am trying to compress a lot of data (trips data for the travel 
industry) which, as you can see, compresses very well (data is generated often 
but does not change that much). I'm testing on 3 weeks of data.

For further tests I will try with -m1 so I can give feedback more often than 
once in 24h or so, since compression takes so long :-)

Original comment by mvanl...@gmail.com on 6 Nov 2011 at 12:20

GoogleCodeExporter commented 9 years ago
Last try with -mx (compression size limited to 1Gb):

$ arc create -mx -mt8 -lc1gb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs*

FreeArc 0.666 Creating archive: test.arc using 
rep:768mb+exe+delta+tempfile+lzma:89mb:normal:bt4:128, $obj => 
rep:768mb+delta+tempfile+lzma:89mb:normal:bt4:128, $text => 
dict:128mb:80%:l8192:m400:s100+lzp:160mb:92%:145:h23:d1mb+ppmd:16:384mb, $wav 
=> tta, $bmp => mm+grzip:8mb:m1:l2048:h15:a
Memory for compression 992mb, decompression 992mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed  99.9%Stack space 
overflow: current size 80000000 bytes.
Use `+RTS -Ksize' to increase it.

Test with -m1 (I guess the -lc here makes no sense)
$ arc create -m1 -mt24 -lc1gb --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs*

FreeArc 0.666 Creating archive: test.arc using 4x4:tor:3:2mb:h256kb, $exe => 
exe+4x4:tor:3:2mb:h256kb, $wav => mm:d1+4x4:tor:3:2mb:h256kb:t0, $bmp => 
mm:d1+4x4:tor:3:2mb:h256kb:t0, $compressed => storing
Memory for compression 218mb, decompression 205mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed 100.0%Stack space 
overflow: current size 80000000 bytes.
Use `+RTS -Ksize' to increase it.

Bulat, is there anything more I could test out, or is the issue clear enough 
now?
I could dig a little into the "atop" unix utility, which seems to log things 
like memory usage in the background. Would that help you?

Original comment by mvanl...@gmail.com on 7 Nov 2011 at 10:13

GoogleCodeExporter commented 9 years ago
Latest example:

arc create -m4 -mt24 -lc1000m --display --cache16mb --workdir=aNFSMount -r 
test.arc someBigDirs* +RTS -K256m
FreeArc 0.666 Creating archive: test.arc using 
rep:96mb+exe+delta+tempfile+4x4:i1:lzma:16mb:h4mb:normal:24:mc8, $obj => 
rep:96mb+delta+tempfile+4x4:i1:lzma:16mb:h4mb:normal:24:mc8, $text => 
grzip:4854518b:m1:l32:h15, $compressed => rep:96mb+tempfile+4x4:i3:tor:16mb:c3, 
$wav => tta, $bmp => mm+grzip:4854518b:m1:l2048:h15:a
Memory for compression 1002mb, decompression 918mb, cache 16mb
Compressing 14,459 files, 712,588,416,460 bytes. Processed  23.6%
ERROR: can't allocate memory required for (de)compression in 
4x4:i1:lzma:16mb:h4mb:normal:24:mc8

So not a stack overflow, but again the original problem, where even though 
there is in fact plenty of free memory (32Gb host used for now solely to test 
freearc), freearc could not allocate required memory.

Original comment by mvanl...@gmail.com on 14 Nov 2011 at 7:56