Closed GoogleCodeExporter closed 8 years ago
Thanks for the bug report. Sorry for late reply.
I am busy with another project and right now I cannot say when I will get time
to
work on this issue. But I do want to find out what is going on.
Meanwhile please do study source code and try to improve, I can always provide
some
pointers.
Original comment by vijay.pa...@gmail.com
on 26 May 2009 at 2:46
I'm using your code from a while and probably I found the cause of the problems
reported by Jonathan.
There are several memory leaks.
The two that I found till now are on the generation of the nodes:
// main solve function:
BBNode * node = new BBNode(master_lp, (long int)1);
then the same node is "popped out" the list
if(csp_option.search == BFS) {
node = bbnode_set.front();
bbnode_set.pop_front();
} else if(csp_option.search == DFS) {
node = bbnode_set.back();
bbnode_set.pop_back();
}
but at the end of the while loop is not deleted...
the other one is on the pattern list but I'm still investigating on it...
Original comment by ilpanda@gmail.com
on 20 Oct 2009 at 9:16
ilpanda,
Thanks for your efforts. As you pointed out, it looks like node objects are not
being
deleted. It should be okay to delete node object, whenever node is fathomed
while(bbnode_set.empty() == false) {
// whenever node is fathomed
delete node;
}
Right now, BBNode destructor function does nothing. Do we need to release any
memory
inside destructor?
Could you please try, Jonathan's test case and check if by deleting fathomed
nodes,
peak memory usage comes down.? If you successfully resolve the issue, you can
commit
the code.
Original comment by vijay.pa...@gmail.com
on 22 Oct 2009 at 3:33
I have run some initial tests on the stated problem data set which suggest that
deleting fathomed nodes does
not produce a noticeable decrease in the memory usage. I will look at BBNode
more closely and see if destructor
code is required.
Original comment by jonathan...@yahoo.com
on 22 Oct 2009 at 9:40
Unfortunately I have not been able to track down the source of the memory
problems in cspsol (assuming that a leak does exist).
Merely deleting fathomed nodes has negligible effect.
After a day of probing and modifying I was unable to reduce memory usage at all.
The data set above is too troublesome to work with as the memory usage is so
large and the run time too long.
I used the following during my testing:
The data set below takes 30 secs to complete on Intel 1.83GHZ 2GB RAM.
The process consumes 295MB of memory.
6000
3000 16
2400 8
800 12
700 8
600 16
534 16
447 16
334 16
My knowledge of the algorithms involved and the GNU lpkit are much too scant
for me to be effective agent in this case.
I imagine that the original author would be best placed to make headway with
this issue.
Original comment by jonathan...@yahoo.com
on 23 Oct 2009 at 7:52
Dear all,
I was not able to spend much time last week on this, by the way these are the
leaktest detected so far:
file name and (row)
cspsol-0.9\src\pattern.cpp(249)
cspsol-0.9\src\pattern.cpp(234)
cspsol-0.9\src\pattern.cpp(233)
cspsol-0.9\src\bb_node.cpp(221) //I still have to check if this could be solved
cspsol-0.9\src\bb_node.cpp(226) //adding the delete node in the main loop
Original comment by ilpanda@gmail.com
on 26 Oct 2009 at 10:28
This afternoon I had some time to spend on this:
here are two "fast and dirty" solution. Jonathan, please understand that the the
memory cleanup is done just at the end of the "solve_csp" function so the memory
usage improvement is very little.
By the way first patch:
pattern.cpp 166 //this should improve a little bit the memory usage
if(node->check_duplicate(new_pat) == true)
{
delete new_pat;
new_pat = NULL;
}
The second patch is related to the node leaktest. Each time a "good" result is
found
the "solve" function assing to the BestNode pointer the reference of the "good"
node.
This doesn't permit to delete the node itself because at the end of the
solve_csp the
BestNode pointer is for reporting.
I changed a little bit the BBNode class (added the copy function) and the main
csp_solve function (in which I delete every node at the end of the while loop).
Please see attached files: it's too long to describe here.
Third patch is related to the add_init_pattern function: this pattern list is
initialized in the "root" node (Vijay: correct me if I'm wrong!) but never
cleaned
up. You can find the dirty solution in the pattern.cpp attached file.
By the way here the results with my patch and the test case provided by
Jonathan:
cpsol.exe --data ..\data\case7.txt
Branch and bound tree exhausted.
# Total runtime = 512 Secs
# Solution Report #
Best integer obj. func. value = 19
Pattern count = 1: 600 x 1, 534 x 10,
Pattern count = 3: 3000 x 2,
Pattern count = 1: 700 x 1, 534 x 7, 447 x 2, 334 x 2,
Pattern count = 7: 3000 x 1, 2400 x 1, 600 x 1,
Pattern count = 1: 3000 x 1, 800 x 3, 600 x 1,
Pattern count = 2: 3000 x 1, 800 x 2, 700 x 2,
Pattern count = 1: 600 x 7, 447 x 4,
Pattern count = 1: 800 x 1, 700 x 2, 447 x 7, 334 x 2,
Pattern count = 1: 800 x 3, 700 x 1, 447 x 2, 334 x 6,
Pattern count = 1: 2400 x 1, 800 x 1, 447 x 1, 334 x 7
memory usage 710,556Kb
Please note that the program is compiled in release mode and it's running under
windows xp.
Hope this can help.
Francesco
Original comment by ilpanda@gmail.com
on 26 Oct 2009 at 2:45
Attachments:
Hi Francesco
Thanks for putting some time into this.
Unfortunately the mods, as you note, only seem to have a very minor effect on
memory utilisation.
I have compiled and tested the code for both Linux and OS X.
Thanks again
Jonathan
Original comment by jonathan...@yahoo.com
on 27 Oct 2009 at 12:24
I though I would try Running valgrind on OS X 10.5
CL:
valgrind --leak-check=yes ./cspsol --data mt-test.txt
the data set is:
6000
3000 16
1234 4
600 16
534 16
447 16
334 16
Output:
# Total runtime = 256 Secs
# Solution Report #
Best integer obj. func. value = 15
Pattern count = 6: 3000 x 2,
Pattern count = 4: 3000 x 1, 600 x 5,
Pattern count = 2: 600 x 1, 534 x 3, 447 x 7, 334 x 2,
Pattern count = 1: 1234 x 4, 447 x 2,
Pattern count = 1: 534 x 11,
Pattern count = 1: 334 x 17,
About to cleanup:
==26829==
==26829== HEAP SUMMARY:
==26829== in use at exit: 21,005,221 bytes in 1,309,660 blocks
==26829== total heap usage: 5,685,704 allocs, 4,376,044 frees, 74,158,780
bytes allocated
==26829==
==26829== 12 bytes in 1 blocks are definitely lost in loss record 2 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0x2719: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 32 bytes in 4 blocks are possibly lost in loss record 5 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0xC42A: opt_sol_set(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0xC6FC: get_dp_solution(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, double) (in ./cspsol)
==26829== by 0xB4AD: Pattern::get_new_pattern(BBNode*,
std::vector<OrderWidth*, std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0x5501: BBNode::solve(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, std::deque<BBNode*, std::allocator<BBNode*> >&)
(in ./cspsol)
==26829== by 0x237B: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 200 (72 direct, 128 indirect) bytes in 1 blocks are definitely lost
in loss record 15 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0x2220: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 888 bytes in 37 blocks are possibly lost in loss record 27 of 48
==26829== at 0x472CD: operator new[](unsigned long) (vg_replace_malloc.c:264)
==26829== by 0xB975: KnapSol::KnapSol(KnapSol*) (in ./cspsol)
==26829== by 0xC43F: opt_sol_set(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0xC6FC: get_dp_solution(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, double) (in ./cspsol)
==26829== by 0xB4AD: Pattern::get_new_pattern(BBNode*,
std::vector<OrderWidth*, std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0x5501: BBNode::solve(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, std::deque<BBNode*, std::allocator<BBNode*> >&)
(in ./cspsol)
==26829== by 0x237B: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 1,624 (640 direct, 984 indirect) bytes in 20 blocks are definitely
lost in loss record 32 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0xC7D1: get_dp_solution(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, double) (in ./cspsol)
==26829== by 0xB4AD: Pattern::get_new_pattern(BBNode*,
std::vector<OrderWidth*, std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0x5501: BBNode::solve(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, std::deque<BBNode*, std::allocator<BBNode*> >&)
(in ./cspsol)
==26829== by 0x237B: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 18,212 (3,168 direct, 15,044 indirect) bytes in 44 blocks are
definitely lost in loss record 45 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0x5A9B: BBNode::branch(std::deque<BBNode*,
std::allocator<BBNode*> >&) (in ./cspsol)
==26829== by 0x2558: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 18,612 (3,240 direct, 15,372 indirect) bytes in 45 blocks are
definitely lost in loss record 46 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0x5B0F: BBNode::branch(std::deque<BBNode*,
std::allocator<BBNode*> >&) (in ./cspsol)
==26829== by 0x2558: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== 20,943,512 (5,236,208 direct, 15,707,304 indirect) bytes in 654,526
blocks are definitely lost in loss record 48 of 48
==26829== at 0x46AD3: operator new(unsigned long) (vg_replace_malloc.c:220)
==26829== by 0xC42A: opt_sol_set(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0xC6FC: get_dp_solution(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, double) (in ./cspsol)
==26829== by 0xB4AD: Pattern::get_new_pattern(BBNode*,
std::vector<OrderWidth*, std::allocator<OrderWidth*> >&, int) (in ./cspsol)
==26829== by 0x5501: BBNode::solve(std::vector<OrderWidth*,
std::allocator<OrderWidth*> >&, std::deque<BBNode*, std::allocator<BBNode*> >&)
(in ./cspsol)
==26829== by 0x237B: solve_csp() (in ./cspsol)
==26829== by 0x2D1D: main (in ./cspsol)
==26829==
==26829== LEAK SUMMARY:
==26829== definitely lost: 5,243,340 bytes in 654,637 blocks
==26829== indirectly lost: 15,738,832 bytes in 654,868 blocks
==26829== possibly lost: 920 bytes in 41 blocks
==26829== still reachable: 21,749 bytes in 105 blocks
==26829== suppressed: 380 bytes in 9 blocks
==26829== Reachable blocks (those to which a pointer was found) are not shown.
==26829== To see them, rerun with: --leak-check=full --show-reachable=yes
==26829==
==26829== For counts of detected and suppressed errors, rerun with: -v
==26829== ERROR SUMMARY: 8 errors from 8 contexts (suppressed: 0 from 0)
I am not adept at reading these reports for STL containers so I am unsure if
the larger of the reported leaks is merely the containers cache rather than an
actual leak.
A part from that leakage seems minimal.
It might be useful to run on a larger data set but my test machine is short on
RAM.
Original comment by jonathan...@yahoo.com
on 27 Oct 2009 at 5:30
Hi Jonathan,
did you run the test with my patch or with the original 0.9 code?
Original comment by ilpanda@gmail.com
on 27 Oct 2009 at 8:14
Hi Francesco
Sorry, I should have said that I ran it against the original 0.9 code.
Original comment by jonathan...@yahoo.com
on 28 Oct 2009 at 1:08
ilpanda,
excellent efforts.
"The second patch is related to the node leaktest. Each time a "good" result is
found
the "solve" function assing to the BestNode pointer the reference of the "good"
node.
This doesn't permit to delete the node itself because at the end of the
solve_csp the
BestNode pointer is for reporting.
I changed a little bit the BBNode class (added the copy function) and the main
csp_solve function (in which I delete every node at the end of the while loop)."
This is likely be the main cause. I am studying your patches, please bear with
me.
I am already excited about the improvements that this patch will have on
benchmarks :
http://code.google.com/p/cspsol/wiki/Benchmarks09
Original comment by vijay.pa...@gmail.com
on 29 Oct 2009 at 4:11
ilpanda,
I studied your patches. I applied and committed changes based on you second and
third
patche. First patch is not relevant for this issue, because that code is
executed
only for --si flag.
Changes are committed.
http://code.google.com/p/cspsol/source/detail?r=39
Please checkout latest code from repository and see if you get any improvements
on
you machine.
The issue is NOT yet resolved. I do not see any significant reduction in memory
usage
for Jonathan's test case. In fact, on my machine I get memory allocation error
after
solving around 9860 nodes.
I still applied your patches because they fix memory leaks. I have identified
few
more memory leaks, but nothing that will explain why it takes so much memory
for J's
test case.
I am not very confident/happy about dynamic programming algo related code. It
could
be improved. I will try to look at some more places to find the cause.
Original comment by vijay.pa...@gmail.com
on 1 Nov 2009 at 3:20
Hi Vijay
Thanks for continuing to work to improve the code.
cspsol seems, AFAIK, to be the only open source CSP solver available out there
so it would be great if we could fix this issue.
I am not an experienced c++/STL programmer so I might not be able to take this
all the way.
The leak may be here, in knapsack.cpp, function op_sol_set().
for(int p = 0; p < ss->get_size(); p++) {
KnapSol * sp = ss->solutions[p];
KnapSol * new_sp = new KnapSol(sp); // this allocation is leaking
new_sp->val[bi] += 1;
new_ss->add_solution(new_sp);
}
The KnapSol destructor does seem to be getting called so instances may be going
astray.
I instrumented the code using the mac OS X developer tool Instruments.
In case you don't have access to a mac I have attached some screen shots.
You can see that the code is leaking hundreds of thousands of 16 and 32 byte
malloc blocks.
The KnapSol instances get pushed onto the KnapSolSet but don't seem to get
popped.
But you do seem to delete them - but my lack of STL knowledge is wanting here.
I might get some time at this later in the week but you will surely might
better progress that I will.
Original comment by jonathan...@yahoo.com
on 1 Nov 2009 at 11:01
Attachments:
Jonathan
Thanks for providing screen shots. I found the memory leak. Following patch
fixes the
memory leak.
Index: knapsack.cpp
===================================================================
--- knapsack.cpp (revision 32)
+++ knapsack.cpp (working copy)
@@ -219,6 +219,10 @@
{
if(this->solution_exists(sol) == false)
solutions.push_back(sol);
+ else {
+ // KnapSol object not needed, so delete it.
+ delete(sol);
+ }
}
Code changes committed, please see :
http://code.google.com/p/cspsol/source/detail?r=40
I ran your test case without any problems :
# Total runtime = 557 Secs
# Solution Report #
Best integer obj. func. value = 22
Pattern count = 2: 3000 x 2,
Pattern count = 1: 700 x 1, 534 x 7, 447 x 2, 334 x 2,
Pattern count = 2: 800 x 1, 447 x 8, 445 x 2, 400 x 1, 334 x
1,
Pattern count = 1: 700 x 1, 447 x 1, 445 x 1, 400 x 1, 334 x
12,
Pattern count = 8: 3000 x 1, 2400 x 1, 600 x 1,
Pattern count = 3: 3000 x 1, 800 x 1, 700 x 2, 400 x 2,
Pattern count = 1: 534 x 11,
Pattern count = 1: 3000 x 1, 600 x 5,
Pattern count = 1: 700 x 1, 445 x 11, 400 x 1,
Pattern count = 1: 800 x 7, 334 x 1,
Pattern count = 1: 600 x 8, 400 x 3,
It is expected that optimal solution may vary on your machine.
Original comment by vijay.pa...@gmail.com
on 2 Nov 2009 at 3:41
Vijay
Patch is a good one. Thanks for following this issue up.
My optimal solution is indeed different.
Branch and bound tree exhausted.
# Total runtime = 642 Secs
# Solution Report #
Best integer obj. func. value = 20
Pattern count = 1: 600 x 1, 445 x 12,
Pattern count = 1: 600 x 10,
Pattern count = 2: 2400 x 2, 800 x 1, 400 x 1,
Pattern count = 5: 3000 x 2,
Pattern count = 2: 600 x 1, 534 x 1, 447 x 8, 445 x 2,
400 x 1,
Pattern count = 1: 3000 x 1, 2400 x 1, 600 x 1,
Pattern count = 4: 3000 x 1, 800 x 1, 700 x 2, 400 x 2,
Pattern count = 1: 2400 x 2, 600 x 2,
Pattern count = 1: 534 x 11,
Pattern count = 1: 3000 x 1, 2400 x 1, 534 x 1,
Pattern count = 1: 800 x 6, 534 x 2,
Original comment by jonathan...@yahoo.com
on 2 Nov 2009 at 11:13
Vijay
seems that now the code is running very well.
I ran the following data set (shoul be one of the example provided by Jonathan)
6000
3000 16
2400 8
800 12
700 8
600 16
534 16
447 16
334 16
with my original patches and the last one:
Branch and bound tree exhausted.
# Total runtime = 133 Secs
# Solution Report #
Best integer obj. func. value = 19
Pattern count = 1: 600 x 1, 534 x 10,
Pattern count = 3: 3000 x 2,
Pattern count = 1: 700 x 1, 534 x 7, 447 x 2, 334 x 2,
Pattern count = 7: 3000 x 1, 2400 x 1, 600 x 1,
Pattern count = 1: 3000 x 1, 800 x 3, 600 x 1,
Pattern count = 2: 3000 x 1, 800 x 2, 700 x 2,
Pattern count = 1: 600 x 7, 447 x 4,
Pattern count = 1: 800 x 1, 700 x 2, 447 x 7, 334 x 2,
Pattern count = 1: 800 x 3, 700 x 1, 447 x 2, 334 x 6,
Pattern count = 1: 2400 x 1, 800 x 1, 447 x 1, 334 x 7,
# Total patterns = 1045
memroy usage: 2.996 KB
and this is without last patch:
Branch and bound tree exhausted.
# Total runtime = 509 Secs
# Solution Report #
Best integer obj. func. value = 19
Pattern count = 1: 600 x 1, 534 x 10,
Pattern count = 3: 3000 x 2,
Pattern count = 1: 700 x 1, 534 x 7, 447 x 2, 334 x 2,
Pattern count = 7: 3000 x 1, 2400 x 1, 600 x 1,
Pattern count = 1: 3000 x 1, 800 x 3, 600 x 1,
Pattern count = 2: 3000 x 1, 800 x 2, 700 x 2,
Pattern count = 1: 600 x 7, 447 x 4,
Pattern count = 1: 800 x 1, 700 x 2, 447 x 7, 334 x 2,
Pattern count = 1: 800 x 3, 700 x 1, 447 x 2, 334 x 6,
Pattern count = 1: 2400 x 1, 800 x 1, 447 x 1, 334 x 7,
# Total patterns = 1045
memory usage 1.457.328 KB
This was impressing!
Seems that finally you can close this issue! :)
Thank you guys!
Francesco
Original comment by ilpanda@gmail.com
on 3 Nov 2009 at 11:15
Bug is fixed, verified, closing the issue. Please use/download latest release.
Original comment by vijay.pa...@gmail.com
on 20 Jan 2010 at 9:09
Original issue reported on code.google.com by
jonathan...@yahoo.com
on 16 May 2009 at 9:47