angeloschatzimparmpas / t-viSNE

t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections
https://doi.org/10.1109/TVCG.2020.2986996
MIT License

This Git repository contains the code that accompanies the research paper "t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections". The details of the experiments and the research outcome are described in the paper.

Note: t-viSNE is optimized for standard resolutions such as 1440p/QHD (Quad High Definition) and 1080p. At any other resolution, you might need to adjust your browser's zoom level manually for the tool to display properly.

Note: The tag paper-version matches the implementation at the time of the paper's publication. The current version might look significantly different depending on how much time has passed since then.

Note: This software is based on the bhtsne library, its native executable, and the Python interface that is used to call the native executable. This library is the official implementation of t-SNE, made by its authors. Given the exact same input data, different systems will generate slightly different outputs with this library, and such differences will propagate to our software.

Note: Like any other software, the code is not bug-free. There might be limitations in the views and functionalities of the tool that could be addressed in a future code update.

Data Sets

All data sets used in the paper are in the data folder, formatted as comma-separated values (CSV). Most of them are available online from the UCI Machine Learning Repository: Iris, Breast Cancer Wisconsin (Original), Pima Indians Diabetes, and SPECTF. We also used a custom-made data set with Gaussian clusters.
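If you want to inspect one of the CSV files before loading it into the tool, a minimal sketch along the following lines may help. It uses only Python's standard csv module. The file name data/iris.csv and the presence of a header row are assumptions made for illustration; check the data folder for the actual file names and layout.

```python
import csv
from pathlib import Path


def load_csv(path, has_header=True):
    """Read a CSV file and return (header, rows).

    has_header is an assumption about the file layout: set it to
    False if the first line of the file is already a data row.
    """
    with Path(path).open(newline="") as handle:
        # Skip completely empty lines, which csv.reader yields as [].
        rows = [row for row in csv.reader(handle) if row]
    if has_header and rows:
        return rows[0], rows[1:]
    return [], rows


def summarize(path, has_header=True):
    """Return (number of data rows, number of columns) as a sanity check."""
    header, rows = load_csv(path, has_header)
    n_columns = len(header) if header else (len(rows[0]) if rows else 0)
    return len(rows), n_columns


if __name__ == "__main__":
    # Hypothetical file name; replace it with a file from the data folder.
    example = Path("data/iris.csv")
    if example.exists():
        print(summarize(example))
```

All values are returned as strings; converting them to numbers is left to the caller, since the column types differ between data sets.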

Requirements

For the backend:

You can install all the backend requirements with the following command:

pip install -r requirements.txt

For the frontend:

There is no need to install anything for the frontend, since all modules are in the repository.

Usage

Below is an example of how you can get t-viSNE running using Python for both the frontend and the backend. The frontend is written in JavaScript/HTML, so it could be hosted on any other web server of your preference. The only hard requirement (currently) is that both the frontend and the backend must be running on the same machine.

# first terminal: hosting the visualization side (client)
# for Python 3
python3 -m http.server 8000

or

# for Python 2
python -m SimpleHTTPServer 8000

# second terminal: hosting the computational side (server)
FLASK_APP=tsneGrid.py flask run

Then, open your browser and point it to localhost:8000. We recommend using an up-to-date version of Google Chrome.
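If the page does not load or the views stay empty, it can help to confirm that both processes are actually listening. The sketch below is one possible check using only the standard library. Port 8000 comes from the instructions above; port 5000 is Flask's default for flask run and is an assumption here, since the backend port is not stated in this README. The function names are hypothetical helpers, not part of t-viSNE.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(services, host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # 8000: frontend port from the instructions above.
    # 5000: Flask's default port, assumed for the backend.
    status = check_services({"frontend": 8000, "backend": 5000})
    for name, reachable in status.items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

A successful connection only shows that something is listening on the port, not that it is the t-viSNE frontend or backend, so treat the result as a first diagnostic step.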

Reproducibility of the Results

The following instructions describe how to reach the results presented in Figure 1 of the article. This figure corresponds to Subsection 5.2 (Use Case: Improving Diabetes Classification), which is the main use case described in the paper.

Note: We used OSX and Google Chrome in all our tests, so we cannot guarantee that it works on other operating systems or in other browsers. However, since t-viSNE is written in JS and Python, it should work on most common platforms.

Tip: The Reset Filters button, illustrated in Figure 1(h), resets all the applied interactions in case you made a mistake and want to redraw something.

Outcome: The above process describes how you will be able to reproduce precisely the results presented in Figures 1 and 7 of the paper. Thank you for your time!

Corresponding Author

For any questions with regard to the implementation or the paper, feel free to contact Angelos Chatzimparmpas.