Open prihoda opened 4 years ago
I'll try to get to this when I can, but segfaults inside of numba can be very difficult to track down well.
Thanks, let me know if I can help
I also get a segfault on 0.4.2. My conda env:
After switching to 0.3.10 (and not changing anything else) the problem is gone.
Sorry to revive a potentially dead thread, but this issue seems to be rearing its head again. I'm getting a segfault in this exact same spot as well. It started when I began playing around with the numba threading layers (setting it to tbb) in order to use UMAP with ProcessPoolExecutor. It happened very suddenly, and now consistently happens whenever I try to run UMAP inside a script, regardless of threading layer or whether it is running inside a process pool.
The weird thing is that the segfault does not occur if I just run UMAP inside a Python terminal; it only occurs when I run it from the command line through a script.
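For a crash that only shows up in script runs, the standard-library faulthandler module can dump a Python-level traceback when the process receives a fatal signal such as SIGSEGV. It is the in-script equivalent of the `python -X faulthandler` flag used elsewhere in this thread. The following is a minimal sketch: the helper name `enable_crash_tracebacks` is made up for illustration, and it only reports where the crash happens, it does not prevent it.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Turn on faulthandler so fatal signals dump the tracebacks of all threads.

    enable_crash_tracebacks is a hypothetical helper name, not part of umap or numba.
    The stream must be a real file with a file descriptor; it defaults to sys.stderr.
    """
    if not faulthandler.is_enabled():
        target = sys.stderr if stream is None else stream
        faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling `enable_crash_tracebacks()` at the very top of the script, before importing umap, should be enough to get a traceback from a segfault raised later in the run.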
The error looks like this:
08/07/2021 08:38:28 AM INFO: Finding disconnections...
Fatal Python error: Segmentation fault
Current thread 0x00007fd5f4381700 (most recent call first):
File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/", line 313 in nearest_neighbors
File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/", line 557 in fuzzy_simplicial_set
File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packa^CSegmentation fault (core dumped)
and my conda environment looks like this:
The following changes seem to have partially fixed my issues. Numba parallelism seems to break compute_membership_strengths and fast_knn_indices for whatever reason:
diff --git a/umap/ b/umap/
index 0ebb8f3..824a97d 100644
--- a/umap/
+++ b/umap/
@@ -352,7 +352,7 @@ def nearest_neighbors(
"rhos": numba.types.float32[::1],
"val": numba.types.float32,
- parallel=True,
+ parallel=False,
def compute_membership_strengths(
diff --git a/umap/ b/umap/
index 5eb7ddd..d6d3601 100644
--- a/umap/
+++ b/umap/
@@ -11,7 +11,7 @@ from sklearn.utils.validation import check_is_fitted
import scipy.sparse
def fast_knn_indices(X, n_neighbors):
"""A fast computation of knn indices.
The problem is that this is significantly slower than it was previously, which makes sense. Additionally, a second segfault now occurs on a different set of data, at a seemingly random point in pynndescent:
Python error: Segmentation fault
Thread 0x00007f3d61e58700 (most recent call first):
File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/pynndescent/", line 876 in __init__
Which is a call to this function:
self._neighbor_graph = nn_descent(
More specifically, line 876 is where self.n_neighbors is used. I'm really not sure what is going on here. These errors are occurring in a fresh conda environment, so I'm kind of at a loss.
Downgrading numba does not fix this issue. Downgrading pynndescent doesn't fix this issue either.
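For anyone who wants to check whether numba's threading is involved without patching parallel=True in the installed packages as in the diff above, one option is to restrict numba through its environment variables. This is a minimal sketch, not a confirmed fix for this issue: NUMBA_NUM_THREADS and NUMBA_THREADING_LAYER are documented numba settings, while the helper name `configure_numba_env` is made up for illustration. The variables have to be set before numba (and therefore umap) is first imported.

```python
import os


def configure_numba_env(num_threads=1, threading_layer="workqueue"):
    """Set numba's environment variables; call before importing numba or umap.

    configure_numba_env is a hypothetical helper, not part of numba or umap.
    """
    settings = {
        # Size of numba's thread pool for parallel=True functions.
        "NUMBA_NUM_THREADS": str(num_threads),
        # Threading backend numba should use: "workqueue", "omp", or "tbb".
        "NUMBA_THREADING_LAYER": threading_layer,
    }
    os.environ.update(settings)
    return settings


applied = configure_numba_env()
print(applied)
# Only import umap after this point, so numba picks up the settings:
# import umap
```

If the crash disappears with a single thread, that points toward the parallel code paths. If it persists, the cause is probably elsewhere.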
I am getting a Segmentation fault (core dumped) with any input data on Linux.
UMAP version: (happens with 0.4.1 as well)
OS:
I also tried running using Binder on ubuntu but there it works all OK.
Traceback with python -q -X faulthandler:
I added print statements before the call and got:
So the error happens somewhere in the function.
Output of pip freeze:
Output of conda env export