Euclidean Distance Optimizaton for n-Dimensions

Great idea! I checked the paper and I came across with the following possible optimization: The EuclideanDistance between two points, each of them having 128 dimensions, keeps getting called several times across the application. Its computation possibly takes a significant amount of time and that is not good for our algorithm complexity. What I suggest is:

Whenever we compute a specific distance between two points A and B we store the distance of these points inside a hash table. By doing this, we make sure that whenever we need that distance again later in the program, we simply take it from the hash table instead of computing it again.

For the hash table data structure we can use the std::map structure from the STL library in the following way: Since all the calculation for the euclidean distance between two points is inside a specific function called EuclideanDistance(), I suggest we put some extra functionality before the main computation code that checks whether the requested distance has already been computed and is stored inside the hash table and if not, then compute it and store it. The code should look something like the following:

double euclideanDistance(const DataVector<float>& a, const DataVector<float>& b){
    if (a.getDimension() != b.getDimension()) {
        throw std::invalid_argument("Vectors must have the same dimension");
    }

    // Code to check if the given two points exists inside the hash table as a key, and if that is the case
    // return the value of this key pair, which is their distance.

    double sum = 0.0;
    unsigned int dimension = a.getDimension();
    for (unsigned int i = 0; i < dimension; ++i) {
        double diff = a.getDataAtIndex(i) - b.getDataAtIndex(i);
        sum += diff * diff;
    }

   // Code that will add the two given points as a key pair to the hash table, 
   // and the new computed distance as a value to this key pair

    return sqrt(sum);
}

@StavrosJKw take a look at my suggestion and tell me what you think?

AntonisZks / Information-Systems-Software-Development-Project

Euclidean Distance Optimizaton for n-Dimensions #65