Open milmin opened 3 years ago
Hi, is anybody able to resolve this? I am getting the same error in Jupyter notebooks. I tried with both cosine and Euclidean distance metrics.
Following the above closed issue, I have also defined cosine as a custom function and used Numba. Still getting the same error.
Please help.
Hello,
I am still facing this issue. Is there a known workaround, or do we know when the bugfix will be released?
Thank you!
It's a problem in pynndescent that was fixed in the latest release; try updating that package to 0.5.4.
Thank you, but... That's the version I have installed in my environment.
print(pynndescent.__version__) --> 0.5.4
print(umap.__version__) --> 0.5.1
Hi, I am facing this issue in a PySpark environment using a Jupyter notebook. Any dataframe with fewer than 4000 rows works fine, but as soon as the number increases this error pops up. Has anybody been able to resolve this? I am also using t-SNE for my single-cell data in the same PySpark environment, and that works fine. I have tried downgrading umap-learn and pynndescent, trying different versions as well as the latest versions of both. Nothing is helping.
Hello 🙋‍♂️
I finally found a hack. It seems that some weird internal overriding makes the FlatTree class (a collections.namedtuple defined in pynndescent.rp_trees) not picklable.
# This is a hack to be able to use UMAP.fit_transform with more than 4095 samples.
# See the links below:
# https://github.com/lmcinnes/umap/issues/477
# https://github.com/lmcinnes/umap/issues/547
import collections
import pynndescent

collections.namedtuple("n", [], module=__name__)
# Point FlatTree's __module__ back at the module where the class actually lives,
# so that pickle can look it up again.
pynndescent.rp_trees.FlatTree.__module__ = "pynndescent.rp_trees"
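For context on why re-pointing __module__ helps: pickle serializes a namedtuple instance by referencing its class through cls.__module__ and cls.__qualname__, so the class has to be findable at exactly that location. A tiny standalone sketch of the mechanism (generic names, not pynndescent itself):

import collections
import pickle

Point = collections.namedtuple("Point", ["x", "y"])  # __module__ is "__main__" here

p = Point(1, 2)
pickle.dumps(p)  # fine: pickle finds Point under __main__

Point.__module__ = "module.that.does.not.exist"  # simulate the broken override
try:
    pickle.dumps(p)
except pickle.PicklingError as exc:
    print(exc)  # pickle can no longer locate the class, so serialization fails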
I hope this helps!
Just copy-paste it below your module's imports.
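A sketch of how the workaround can sit in a typical script, with the patch applied right after the imports and before any UMAP call that would otherwise hit the FlatTree pickling error (the data shape and metric below are only placeholders):

import collections

import numpy as np
import pynndescent
import umap

# Workaround from above, applied before fitting UMAP.
collections.namedtuple("n", [], module=__name__)
pynndescent.rp_trees.FlatTree.__module__ = "pynndescent.rp_trees"

data = np.random.rand(5000, 32)  # more than 4095 rows, which used to fail
embedding = umap.UMAP(metric="cosine").fit_transform(data)
print(embedding.shape)  # (5000, 2)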
It is working like a charm now. 😁 Thank you so much @oscarorti! I tried it on the 5k PBMC dataset for now and will keep the community updated about the 68k one.
A simple fit of UMAP on a random NumPy array with 4096 (or more) rows fails with the traceback detailed below. If the array has fewer than 4096 rows, everything goes fine. What's going wrong? A very similar issue: https://github.com/lmcinnes/umap/issues/477
Framework: Spark-based environment with umap==0.5.0, numba==0.52.0, pynndescent==0.5.1, scipy==1.4.1
Minimal example:
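A minimal sketch of the kind of snippet described above (the array shape and default parameters are assumptions):

import numpy as np
import umap

data = np.random.rand(4096, 10)  # 4095 rows or fewer work; 4096 or more fail

reducer = umap.UMAP()
embedding = reducer.fit_transform(data)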
This gives the following error: