alsrbok closed this issue 2 years ago
Actually, I solved the cudaFree pointer issue by changing num_neighbors and perplexity. I have one more question: what is the role of 'num_neighbors'? I guess it plays a role similar to perplexity.
This is a CUDA out-of-memory error, which suggests that your dataset is too big for the GPU that you're using. Reducing the number of neighbors reduces the GPU memory requirements, but it also reduces the quality of the map, since the T-SNE forces are approximated with fewer reference points. Ideally, you can use PCA to reduce the dimensionality of the data before running the nearest neighbor search, so that T-SNE is computed in a smaller space.
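To illustrate the PCA suggestion, here is a minimal sketch that projects the input features onto their top principal components using only NumPy. The helper name `pca_reduce`, the random stand-in data, and the choice of 50 components are all assumptions made for this example, not values recommended in this thread.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto their top principal components.

    Hypothetical helper for illustration; not part of any library.
    """
    x = np.asarray(features, dtype=np.float32)
    # PCA operates on mean-centered data.
    x = x - x.mean(axis=0, keepdims=True)
    # Economy SVD: the rows of vt are the principal directions,
    # ordered from largest to smallest explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return np.ascontiguousarray(x @ vt[:n_components].T)


# Stand-in data: 1000 points with 100 features each (assumed shapes).
rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 100)).astype(np.float32)

# 50 is an arbitrary illustrative target dimension.
reduced = pca_reduce(features, n_components=50)

# Halving the dimension halves the size of the array handed to the
# nearest neighbor search.
print(features.nbytes, "->", reduced.nbytes)
```

The `reduced` array would then be passed to the same TSNE call in place of the original features. For very large datasets a full SVD may itself be expensive, so a randomized or incremental PCA could be a better fit; that choice is outside what this thread covers.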
When I ran TSNE on flickr (a graph dataset), I ran into the following problem.
Faiss assertion 'err == cudaSuccess' failed in virtual void faiss::gpu::StandardGpuResourcesImpl::deallocMemory(int, void*) at /home/conda/feedstock_root/build_artifacts/faiss-split_1636459887571/work/faiss/gpu/StandardGpuResources.cpp:518; details: Failed to cudaFree pointer 0x7f89cc000000 (error 700 an illegal memory access was encountered)
What's the problem? It suddenly stopped working.
Moreover, when I applied TSNE to ogbn-products (a much larger dataset than flickr), I ran into the following memory issue.
terminate called after throwing an instance of 'faiss::FaissException' what(): Exception thrown from index 0: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /home/conda/feedstock_root/build_artifacts/faiss-split_1636459887571/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 0 space Device stream 0x55671554dd00 size 3435970560 bytes (cudaMalloc error out of memory [2])
Exception thrown from index 1: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /home/conda/feedstock_root/build_artifacts/faiss-split_1636459887571/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 1 space Device stream 0x5567a1e7b460 size 3435970560 bytes (cudaMalloc error out of memory [2])
Exception thrown from index 2: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /home/conda/feedstock_root/build_artifacts/faiss-split_1636459887571/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 2 space Device stream 0x5567c564fe90 size 3435970560 bytes (cudaMalloc error out of memory [2])
Is there any solution for the memory issue?