Overview
Hello!
I've been trying to use your dimension reduction package on hyperspectral imagery, which you cite as an application in your benchmarks. I cloned your repository and ran your benchmark script without any problems. However, when I downloaded the library and tried dimension reduction on similar hyperspectral data sets of my own choosing, I hit the same glaring error every time: [warning] The neighborhood graph is not connected. As a result, my output is full of nan values. This happens with both the Tapkee CLI and the Shogun-Toolbox inputs.
Specific Issue
So let's say I run this command in the CLI:
./tapkee -i iaviris.dat -o avirisdimred.dat -m isomap -td 20 -k 25 --benchmark
The method runs, but the [warning] The neighborhood graph is not connected. warning appears no matter which parameters I vary (k-nearest neighbors, target dimension, eigensolver method). On closer inspection, the output file is simply a list of nan values in every column.
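For what it's worth, here is how I confirmed that the k-NN graph over my data really is disconnected. This is a sketch using scikit-learn and SciPy rather than Tapkee itself, with a hypothetical toy array standing in for the hyperspectral matrix:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components

# Toy stand-in for the hyperspectral matrix: two well-separated clusters,
# so the k-NN graph is guaranteed to split into components.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (50, 5)),
               rng.normal(10.0, 0.1, (50, 5))])

# Build a k-NN graph (a small k here since the toy set only has 100 points;
# on the real data I used -k 25) and symmetrize it, as neighborhood graphs
# for isomap-style methods usually are.
knn = kneighbors_graph(X, n_neighbors=5, mode="connectivity")
knn = knn.maximum(knn.T)

n_components, labels = connected_components(knn, directed=False)
print(n_components)  # anything > 1 means the neighborhood graph is broken
```

On my actual data this check reports more than one component for every k I tried, which matches the Tapkee warning.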
I then varied the dimension reduction technique (PCA, Laplacian eigenmaps, neighborhood preserving embedding, locally linear embedding, etc.) and found that the eigendecomposition would always fail with the message Some error occured: eigendecomposition failed. After varying k, the target dimension, and the method, the only methods that produced actual values in the output file were the stochastic ones (t-distributed stochastic neighbor embedding and stochastic proximity embedding) and MDS. Sometimes the stochastic solutions were nonsense - for example, input data confined to [0, 1] occasionally mapped well outside that range - but I expect some of that from stochastic methods. MDS also emits some nan values, though not for every sample.
I also tried my own hyperspectral data set, although it is much bigger: 145×145 pixels with 200 spectral bands. I flattened the image so the matrix is 21025×200. I still got the same errors; it just took longer to process.
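The flattening itself is trivial; this NumPy sketch (with a random stand-in cube of the same shape as my data) is all I did:

```python
import numpy as np

# Random stand-in for my 145x145 hyperspectral cube with 200 bands
cube = np.random.rand(145, 145, 200)

# Collapse the two spatial axes so each pixel becomes one row of samples
flat = cube.reshape(-1, cube.shape[-1])
print(flat.shape)  # (21025, 200)
```

So I don't think the reshape step is the source of the problem.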
I even get the same issue if I feed the same data sets to the Shogun-Toolbox via Python - the same error, except there it produces no result at all and stops the algorithm altogether.
Question
So, what did you do to produce actual results for your graph? I looked through your script and didn't find any extra commands that would let you overcome this error. Am I missing something in your script for running Tapkee on the AVIRIS dataset? Which commands should I enter or vary to get a sensible solution with your dataset and package? Is there some preprocessing step that you did (or that I could do) to avoid this issue?
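For instance, would restricting the input to the largest connected component of the k-NN graph before calling Tapkee be a reasonable fix? A sketch of what I mean (again scikit-learn/SciPy with toy data in place of the real cube; `X_kept` is the hypothetical matrix I would then write out for tapkee):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components

# Toy data: one big cluster plus a handful of far-away outlier pixels
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (90, 5)),
               rng.normal(10.0, 0.1, (10, 5))])

# Symmetrized k-NN graph, then label its connected components
knn = kneighbors_graph(X, n_neighbors=5, mode="connectivity")
knn = knn.maximum(knn.T)
n_components, labels = connected_components(knn, directed=False)

# Keep only the rows belonging to the largest component
largest = np.bincount(labels).argmax()
X_kept = X[labels == largest]
print(X_kept.shape[0], "of", X.shape[0], "samples kept")
```

I realize this throws away pixels, so if you did something smarter (a larger k, a different distance, some spectral normalization), I'd much rather do that.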
Thank you for your time!