biovault / HDILib

HDILib is a library for the scalable analysis of large and high-dimensional data.
MIT License

Int overflow in void HierarchicalSNE<scalar_type, sparse_scalar_matrix_type>::initializeFirstScale() for large data #28

Open thoellt opened 3 years ago

thoellt commented 3 years ago

In void HierarchicalSNE<scalar_type, sparse_scalar_matrix_type>::initializeFirstScale() there is a danger of integer overflow in the for loop.

int idx = i * nn + n; will overflow for moderately large data (> ~25M points, perplexity 30).

Even unsigned would only double the limit. Consider moving to 64-bit indices wherever code indexes into data structures that hold multiple values per data point.
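
To make the limit concrete, here is a minimal sketch of the index arithmetic done in 64 bits. The helper names flat_index and safe_point_count are hypothetical and not part of HDILib. The value nn = 90 used in the note below is an assumption (three times the perplexity of 30, a common choice in t-SNE implementations); the exact neighbor count used by HDILib is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not HDILib API): flat index of neighbor n of point i
// when nn values are stored per point. All operands are already 64-bit,
// so the multiplication cannot wrap at the 32-bit boundary.
inline std::uint64_t flat_index(std::uint64_t i, std::uint64_t nn, std::uint64_t n) {
    assert(n < nn);  // neighbor offset must stay within one point's block
    return i * nn + n;
}

// Hypothetical helper: a number of points for which every flat index is
// guaranteed to fit into the given index type (conservative bound).
template <typename Index>
constexpr std::uint64_t safe_point_count(std::uint64_t nn) {
    return static_cast<std::uint64_t>(std::numeric_limits<Index>::max()) / nn;
}
```

Under the nn = 90 assumption, safe_point_count<std::int32_t>(90) is about 23.9M points and safe_point_count<std::uint32_t>(90) is about 47.7M, which matches the estimate above and shows why unsigned buys very little. In the existing loop, an equivalent fix would be to cast one operand before multiplying, for example std::uint64_t idx = static_cast<std::uint64_t>(i) * nn + n;, provided the container being indexed also accepts a 64-bit index.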