stromal opened this issue 3 years ago
You shouldn't need anywhere near 256 GB of memory for that computation, so something is clearly going very astray. It may be a joblib-related issue and not actually related to memory. Have you looked at the embedding? Does it seem reasonable? If so, this may be something deeper with joblib. You can always pass core_dist_n_jobs=1
to hdbscan to force it to keep things single-core and avoid joblib.
@lmcinnes I have added core_dist_n_jobs=1
but I received the same error message.
I'm afraid I don't have much more advice. This seems quite peculiar.
@lmcinnes I have checked it: if I cut the data down and use a small part of the dataset (about 30x smaller), it runs without changing anything else. What parameters can I change in HDBSCAN so it can handle my bigger numpy matrix?
On a different dataset (about 50'000 rows, 28 columns, float16) I first tried without core_dist_n_jobs=1
and got the same error message. Then I added core_dist_n_jobs=1
and it worked.
a.) CODE = Official Tutorial, Official dataset
a.) ERROR = Official Tutorial, Official dataset
b.) CODE for HDBSCAN, [OFFICIAL DOCUMENTATION](https://hdbscan.readthedocs.io/en/latest/parameter_selection.html)
b.) ERROR for HDBSCAN