Closed: dustinstansbury closed this issue 8 years ago
This looks like a nasty issue -- thanks for the detailed report, it will help me track it down much faster. I believe the problem to be in the UnionFind for the single linkage clustering, but I'm not actually sure what has caused the issue there (nothing obvious leaps out at least). I'll try and get some debugging done this evening when I can get access to a linux box.
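For readers unfamiliar with the structure mentioned above, here is a minimal pure-Python sketch of an array-based union-find driving single linkage merges. It is an illustration only, not the package's actual Cython code: the names `UnionFind` and `single_linkage`, and the `(left, right, distance, size)` row layout, are assumptions made for this example. It also assumes the input edges form a spanning tree, so every edge joins two distinct clusters.

```python
class UnionFind:
    """Array-based union-find where every merge creates a fresh cluster label."""

    def __init__(self, n_points):
        # Merging n points pairwise creates n - 1 new clusters,
        # so 2n - 1 labels are needed in total.
        self.parent = [-1] * (2 * n_points - 1)
        self.size = [1] * n_points + [0] * (n_points - 1)
        self.next_label = n_points

    def find(self, node):
        """Return the root label of the cluster containing node."""
        root = node
        while self.parent[root] != -1:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root.
        while node != root:
            nxt = self.parent[node]
            self.parent[node] = root
            node = nxt
        return root

    def union(self, a, b):
        """Merge the clusters rooted at a and b under a new label."""
        new = self.next_label
        self.size[new] = self.size[a] + self.size[b]
        self.parent[a] = new
        self.parent[b] = new
        self.next_label += 1
        return new


def single_linkage(n_points, tree_edges):
    """Turn spanning-tree edges (a, b, dist) into merge rows."""
    uf = UnionFind(n_points)
    rows = []
    # Process edges from shortest to longest, as single linkage requires.
    for a, b, dist in sorted(tree_edges, key=lambda e: e[2]):
        ra, rb = uf.find(a), uf.find(b)
        new = uf.union(ra, rb)
        rows.append((ra, rb, dist, uf.size[new]))
    return rows
```

In pure Python an out-of-range label raises an `IndexError`. Compiled code working on raw buffers may have no such check, which is one reason this kind of structure is a plausible place to look for memory errors.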
I have the same problem. I changed to 0.6.4 version and now it works.
Thanks, that's actually helpful right now!
I believe this is working now; let me know if you still see any issues.
@felipemoraes, thanks for the pointer (pun intended) on version 0.6.4; that seemed to get things working for me on Linux. @lmcinnes, it seems the same munmap_chunk() issue persists in version 0.7.1
That's unfortunate; it seemed to fix the equivalent error I was managing to reproduce on the Linux system I was using. I'll look into it further. It seems to involve the way Cython is handling pointers, and I may just have to accept that I can't use the approach I was using because it doesn't work well on Linux.
@dustinstansbury If you get some time and can test the current head in the repository I would be keen to know if that resolves the problem for you. Thanks.
@lmcinnes, it seems that Cython continues to exhibit some problems with pointer handling; the issue persists, even when run from HEAD.
Alright; thanks for that. It seemed to be working for my Linux install, but obviously that is far from universal. 0.6.4 seems to be working for people, so I will try to roll back the relevant code to that version and call it done.
Works for me! Thanks for taking the time to look into it.
First off, thanks for your work on the implementation of the algorithm, it's excellent. I do most of my development on OSX and HDBSCAN has worked like a charm.
However, I've recently been trying to deploy some models requiring the package on some EC2 instances and keep getting segmentation faults associated with the munmap_chunk() function. As an example, I've included the output of a test script that I've run on both my development machine and the Linux box:
Test Script
The output when run on OSX
Great! Now let's try it on EC2...
The output when run on Linux
This likely isn't a problem with the HDBSCAN package per se, but perhaps with how Linux vs OSX allocates/deallocates memory. Unfortunately, I do not have direct access to the Linux box/Docker containers (they're deployed by a 3rd party service), so I can't run Valgrind or the like to track down the error, i.e. the *** Error in 'python':...*** message. One possible solution is to add a compilation flag that checks for memory size, though this may affect performance. Would love to hear your thoughts...
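When Valgrind is not available, one low-effort option is Python's standard-library `faulthandler` module. It dumps the Python-level traceback when the process receives a fatal signal such as SIGSEGV or SIGABRT. The sketch below is only a suggestion under assumptions: the helper name `enable_crash_tracebacks` and the log path are hypothetical. It shows which Python call was active at the crash, not the underlying C-level cause.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Write Python tracebacks to log_path if the interpreter crashes.

    Hypothetical helper for illustration; call it before running the
    clustering code that triggers the crash.
    """
    log_file = open(log_path, "w")
    # Dump tracebacks for all threads on SIGSEGV, SIGFPE, SIGABRT,
    # SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this file object alive: faulthandler writes
    # to its file descriptor at crash time.
    return log_file
```

The same handler can be switched on without code changes by running `python -X faulthandler script.py` or setting the `PYTHONFAULTHANDLER` environment variable, in which case the traceback goes to stderr.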