KeithTheEE / scipy-cluster

Automatically exported from code.google.com/p/scipy-cluster

fclusterdata segfault #37

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Run fclusterdata() on the points loaded from the csv file:

import numpy as np
import hcluster as hc

data = np.loadtxt('/tmp/segfault.csv')
hc.fclusterdata(data, 1)

I get a segfault after a few seconds. Can anyone reproduce this? I get the error 
on both machines I have tried so far. The problem is that it needs a good amount 
of RAM to run.

Machine is Linux #59-Ubuntu SMP x86_64 GNU/Linux.
hcluster 0.2.0, python 2.6.5
numpy 1.6.1

If there is any way I can help, please let me know.

Original issue reported on code.google.com by carlosbe...@gmail.com on 21 Sep 2011 at 1:33

GoogleCodeExporter commented 8 years ago
Thanks for your message and your interest in hcluster. Unfortunately, hcluster 
does not support incremental clustering yet, so the entire distance matrix must 
be in memory. For your data set, the amount of space required for the distance 
matrix alone is (61232.*61231.)/2./1073741824 = 1.74 GB. How much RAM is on your 
machine? If I instead cluster only the first 1000 points, it works without 
error: h.fclusterdata(X[:1000,:],1).
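
A rough sketch of that estimate: a condensed distance matrix for n points holds n(n-1)/2 entries. The helper name `condensed_matrix_size` is hypothetical, and the 8 bytes per entry assumes the distances are stored as float64, which this thread does not state.

```python
def condensed_matrix_size(n_points, bytes_per_entry=8):
    """Return (entries, GiB) for a condensed pairwise distance matrix.

    Hypothetical helper; bytes_per_entry=8 assumes float64 storage.
    """
    # One distance per unordered pair of points.
    entries = n_points * (n_points - 1) // 2
    # Convert bytes to GiB (2**30 bytes).
    gib = entries * bytes_per_entry / float(2 ** 30)
    return entries, gib


entries, gib = condensed_matrix_size(61232)
print(entries)        # number of pairwise distances
print(round(gib, 2))  # size in GiB under the float64 assumption
```

Under that assumption, the quotient in the comment above (about 1.75) is the entry count divided by 2^30, so the footprint in bytes would be eight times that, close to 14 GiB.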

Original comment by damian.e...@gmail.com on 21 Sep 2011 at 8:20

GoogleCodeExporter commented 8 years ago
I have 48GB of RAM on the server where I am running these tests, so it should be 
able to handle it. However, I get a memory error on my PC with 8GB of RAM, which 
is not the same as the segfault.
Other data sets seem to work fine, but I am having problems with the set that I 
am sending you.

Is there any particular command I can run to give you more information about 
the segfault?

Thank you.

Original comment by carlosbe...@gmail.com on 21 Sep 2011 at 8:28

GoogleCodeExporter commented 8 years ago
Sorry for the much delayed reply. In IPython, can you try creating an array of 
size 61232 by 61232 by typing

   np.zeros((61232, 61232))

Thanks,

Damian

Original comment by damian.e...@gmail.com on 6 Oct 2011 at 7:29