I think I know why this error happens: I am mixing MATLAB's MEX memory allocations with C library allocations. A MEX file will usually appear to work fine even when some allocations are not released correctly, but after a while MATLAB will simply segfault. I suspect some of my allocations are not freed through the matching deallocator.
I will take a look at it and fix the bug. I also found the solution to the bug from the other issue and will post that as well.
Thanks a lot.
Original comment by abhirana
on 11 Nov 2011 at 8:03
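(For context, the failure mode described above typically looks like the sketch below. This is an illustrative snippet, not code from this package; it shows memory from the two allocators being released through the wrong deallocator.)

#include "mex.h"
#include <cstdlib>

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    /* MATLAB-managed buffer: must be released with mxFree (or left to the
       MEX memory manager), never with free(). */
    double *a = (double *) mxCalloc(100, sizeof(double));

    /* C-runtime buffer: must be released with free(), never with mxFree(). */
    double *b = (double *) calloc(100, sizeof(double));

    free(a);    /* WRONG: corrupts MATLAB's heap; often crashes much later */
    mxFree(b);  /* WRONG: same mismatch in the other direction */

    /* The correct pairing would be: mxFree(a); free(b); */
}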
Hi,
Can you check out the SVN source? It has the fix for the memory issues.
I ran a few thousand iterations (parfor) with your dataset and parameters and it did not crash, so I think it should be fine now. I compiled with Visual Studio 2010 on 64-bit Windows 7, with MATLAB 7.12 and 16 GB RAM.
thanks
Original comment by abhirana
on 11 Nov 2011 at 10:13
Hi, I checked it for 15K iterations and it worked well. Thanks!
If it doesn't work in my application, I'll let you know.
Original comment by shapoval...@gmail.com
on 14 Nov 2011 at 8:19
Hi,
I can't compile this new version (mxCalloc, mxFree) on a 64-bit Linux system.
Here is the error:
g++ -fpic -O2 -funroll-loops -msse3 -Wall -c src/classTree.cpp -o tempbuild/classTree.o
src/classTree.cpp:42:20: error: matrix.h: No such file or directory
src/classTree.cpp: In function ‘void catmax_(double*, double*, double*, int*, int*, int*, double*, int*, int*, int*, int*)’:
src/classTree.cpp:102: error: ‘mxCalloc’ was not declared in this scope
src/classTree.cpp:145: error: ‘mxFree’ was not declared in this scope
src/classTree.cpp: In function ‘void predictClassTree(double*, int, int, int*, int*, double*, int*, int*, int, int*, int, int*, int*, int)’:
src/classTree.cpp:226: error: ‘mxCalloc’ was not declared in this scope
src/classTree.cpp:258: error: ‘mxFree’ was not declared in this scope
Thanks,
MJ
Original comment by m.se...@gmail.com
on 14 Mar 2012 at 8:44
Hi m.seyed,
Could you try the latest source? I fixed the makefile.
Do let me know if you still have any issues.
Thanks for letting me know about the bug.
Original comment by abhirana
on 14 Mar 2012 at 9:16
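(A makefile fix for this class of error commonly works along the lines of the shim below: when the sources are built with plain g++ rather than through mex, the MEX allocators are mapped onto the C runtime. This is an illustrative sketch assuming the standard MATLAB_MEX_FILE define; the actual fix in the repository may differ.)

#ifdef MATLAB_MEX_FILE
  /* Built through mex: matrix.h, mxCalloc and mxFree come from MATLAB. */
  #include "mex.h"
#else
  /* Standalone g++ build: fall back to the C runtime allocator. */
  #include <cstdlib>
  #define mxCalloc(n, size) calloc((n), (size))
  #define mxFree(p)         free(p)
#endif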
Hi Abhirana,
Thanks for the quick response. It is working now.
Best,
MJ
Original comment by m.se...@gmail.com
on 14 Mar 2012 at 9:21
Hi,
I still get a segmentation fault in MATLAB when I try to train a random forest model on a 2,000,000 x 2,000 matrix. I just thought I'd report the problem here.
My inputs are integers (I converted them to double before passing them to the random forest) and their range is 0 to 3500.
Best,
MJ
Original comment by m.se...@gmail.com
on 14 Mar 2012 at 10:22
Hi MJ,
Do you have enough RAM? I am guessing your dataset needs at least 32 GB just to store the array (4 billion values at 8 bytes per double) and another 32-50 GB for the internal working of RF.
Also, this might simply be too large a dataset for RF to handle and finish in a reasonable time.
Original comment by abhirana
on 14 Mar 2012 at 10:34
Yes, I have 2 TB of RAM; otherwise I would get an out-of-memory error from MATLAB.
I agree that this is a huge dataset, but I was curious about the performance of RF with only a few trees (say, 5).
Anyway, thanks for your code. I tried it on other, smaller datasets and it worked fine.
Best,
MJ
Original comment by m.se...@gmail.com
on 14 Mar 2012 at 10:38
@MJ
Sorry, I missed your post.
Yup, RF would work, but I don't think this package will finish in any reasonable time: it is currently non-threaded at both the tree level and the node level. Multi-threading is on my todo list. A version of RF threaded at the node level should be able to scale to your dataset, and maybe it is not that bad with a few trees. For example, Kinect uses a version of RF threaded at the node level:
http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf
Regards
Original comment by abhirana
on 20 Mar 2012 at 11:17
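(Since each tree in a random forest is trained independently, tree-level threading is the simplest parallelization to picture. The sketch below illustrates the idea with OpenMP; trainTree is a hypothetical placeholder stubbed out for illustration, not a function from this package.)

#include <omp.h>
#include <vector>

struct Tree { /* node arrays: split variables, split values, ... */ };

/* Hypothetical single-tree trainer: would build one tree on a bootstrap
   sample drawn with the given seed. Stubbed out here. */
static Tree trainTree(const double *X, const int *y, int N, int D, unsigned seed)
{
    (void)X; (void)y; (void)N; (void)D; (void)seed;
    return Tree();
}

/* Trees are independent, so a parallel-for over trees needs no locking. */
std::vector<Tree> trainForest(const double *X, const int *y,
                              int N, int D, int ntree)
{
    std::vector<Tree> forest(ntree);
    #pragma omp parallel for
    for (int t = 0; t < ntree; ++t)
        forest[t] = trainTree(X, y, N, D, 1234u + (unsigned)t);
    return forest;
}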
This is an interesting discussion. I am wondering: is there a rule of thumb for how much memory the MATLAB function allocates for its internal needs?
That is, what is the required amount of memory as a function of the number of samples and their dimensionality?
Thanks for your work on this great package!
Cheers!
Original comment by vladisla...@gmail.com
on 28 Mar 2012 at 10:31
Hi vladislavs,
At least twice the size of the training data, plus slightly more for some temporary variables. In addition, there are 6 variables that store the tree hierarchy, and these may consume more space than necessary: each of the 6 is of size ntree x nrnodes, where nrnodes = 2*N + 1. So the total memory requirement is roughly
2*N*D + 6*ntree*(2*N + 1)
values, where N = number of examples and D = number of features.
For regression, the data is bagged, which creates a shadow copy of the training data; sorting is then done for each feature, so regression scales as N*log(N) in compute time. For classification, a presorting step is done for each feature, which lets classification scale as N, but it still creates a shadow copy of the training data. A todo is to add that presorting step to regression so that its execution time also scales as N.
Original comment by abhirana
on 29 Mar 2012 at 8:42
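(As a back-of-the-envelope check of the formula above against MJ's dataset from earlier in the thread, the following illustrative snippet plugs in N = 2,000,000, D = 2,000, ntree = 5 and 8-byte doubles.)

#include <cstdio>

int main()
{
    const double N = 2e6, D = 2e3, ntree = 5, bytes = 8;
    double data  = 2 * N * D * bytes;               /* data + shadow copy        */
    double trees = 6 * ntree * (2 * N + 1) * bytes; /* 6 ntree-x-nrnodes arrays  */
    std::printf("data:  %.1f GB\n", data / 1e9);    /* ~64 GB */
    std::printf("trees: %.1f GB\n", trees / 1e9);   /* ~1 GB  */
    return 0;
}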
Hi Abhirana,
Thanks for your answer - that clears things up! This should be noted somewhere in the documentation!
Keep up the good work!
Original comment by vladisla...@gmail.com
on 29 Mar 2012 at 9:43
Abhishek,
Do you plan to release the fixed code?
Thanks,
—R
Original comment by romashap...@gmail.com
on 5 Mar 2013 at 10:46
@romashapovalov
The code in the SVN is the latest code; I just haven't put up a download link. A zip file is available here:
https://code.google.com/p/randomforest-matlab/issues/detail?id=41#c8
Original comment by abhirana
on 9 Mar 2013 at 7:15
Original issue reported on code.google.com by
shapoval...@gmail.com
on 10 Nov 2011 at 5:09