Currently it builds a full cross join of the facility and user dataframes, giving every possible combination of facility and user. It then calculates the distance for each pair, ranks the distances within each facility or user (depending on whether you want the facility closest to each user, or the user closest to each facility), and keeps the rows where rank == 1.
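To make the cost of that approach concrete, here is a minimal sketch in plain Python rather than the package's own code. The toy data, the function name nearest_by_cross_join, and the use of straight-line planar distance are all assumptions for illustration; the real code may well use a geographic distance.

```python
from itertools import product
from math import hypot

# Hypothetical toy data: (id, x, y). Planar coordinates are an assumption.
facilities = [("f1", 0.0, 0.0), ("f2", 5.0, 5.0)]
users = [("u1", 1.0, 1.0), ("u2", 4.0, 6.0), ("u3", 0.0, 2.0)]


def nearest_by_cross_join(users, facilities):
    """Cross join, compute every distance, keep the rank-1 facility per user."""
    # Every (user, facility) combination: len(users) * len(facilities) rows.
    joined = [
        (u_id, f_id, hypot(ux - fx, uy - fy))
        for (u_id, ux, uy), (f_id, fx, fy) in product(users, facilities)
    ]
    nearest = {}
    for u_id, f_id, dist in joined:
        # Keeping the smallest distance per user is equivalent to rank == 1.
        if u_id not in nearest or dist < nearest[u_id][1]:
            nearest[u_id] = (f_id, dist)
    return joined, nearest


joined, nearest = nearest_by_cross_join(users, facilities)
```

The joined table has one row per pair, so it grows with the product of the two input sizes, and each row also carries the identifying columns from both sides.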
I think I could circumvent this by using the distance_matrix_cpp function to create the pairwise distances, then taking a rowwise or columnwise minimum, which would give the index of the facility closest to each user, or of the user closest to each facility.
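A rough sketch of that idea, again in plain Python. The distance_matrix function below is only a hypothetical stand-in for distance_matrix_cpp, whose real signature and distance metric are not shown here. It assumes rows are users, columns are facilities, and planar straight-line distance.

```python
from math import hypot


def distance_matrix(users, facilities):
    """Pairwise distances: one row per user, one column per facility.

    Hypothetical stand-in for distance_matrix_cpp; planar distance is assumed.
    """
    return [[hypot(ux - fx, uy - fy) for fx, fy in facilities] for ux, uy in users]


def argmin(values):
    """Index of the smallest value in a sequence."""
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical toy coordinates.
users = [(1.0, 1.0), (4.0, 6.0), (0.0, 2.0)]
facilities = [(0.0, 0.0), (5.0, 5.0)]

dist = distance_matrix(users, facilities)

# Rowwise minimum: index of the facility closest to each user.
nearest_facility_per_user = [argmin(row) for row in dist]

# Columnwise minimum: index of the user closest to each facility.
nearest_user_per_facility = [argmin(col) for col in zip(*dist)]
```

The matrix still holds one value per pair, but nothing else, and the returned indices can be used to look up the matching rows in the original dataframes, so no ranking step is needed.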
So now to get distance_matrix_cpp to work correctly.