sarwanpasha commented 4 years ago

So I am working in Rstudio to solve TSP by giving edgelist as input (as per your instruction in one of my previous question). Here is my code

library("igraph") library("TSP")

edgelist <- structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 19L, 19L, 20L, 20L, 21L, 22L, 22L, 22L, 23L, 23L, 24L, 24L, 24L, 25L, 25L, 26L, 26L, 26L, 27L, 27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 32L, 32L, 33L, 33L, 33L, 34L, 34L, 37L, 38L, 38L, 39L, 40L, 41L, 41L, 42L, 43L, 46L, 47L, 48L, 48L, 49L, 53L, 54L, 58L, 64L), V2 = c(3L, 9L, 61L, 17L, 31L, 51L, 40L, 46L, 42L, 47L, 63L, 8L, 18L, 39L, 30L, 40L, 62L, 13L, 31L, 58L, 50L, 63L, 25L, 35L, 16L, 27L, 44L, 19L, 45L, 54L, 21L, 47L, 55L, 51L, 66L, 35L, 57L, 61L, 18L, 20L, 63L, 52L, 53L, 21L, 37L, 23L, 52L, 56L, 32L, 57L, 34L, 38L, 44L, 60L, 49L, 57L, 36L, 56L, 62L, 36L, 46L, 43L, 60L, 64L, 60L, 65L, 44L, 45L, 52L, 31L, 37L, 41L, 54L, 56L, 35L, 36L, 43L, 48L, 66L, 39L, 50L, 55L, 45L, 59L, 49L, 59L, 42L, 58L, 55L, 65L, 62L, 50L, 51L, 53L, 61L, 65L, 59L, 64L, 66L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90", "91", "92", "93", "94", "95", "96", "97", "98", "99"))

g <- graph_from_edgelist(as.matrix(edgelist)) plot(g)

adj_mat = as_adjacency_matrix(g, sparse = FALSE)

d <- as.dist(1/(1+adj_mat))

tsp <- TSP(d)

o <- solve_TSP(tsp) o

o2 <- solve_TSP(tsp, method="concorde",rep=10, control = list(clo = "-V")) o2

matrix_tour = as.matrix(o) colnames(matrix_tour) = c("v1") pred_tour = matrix_tour[order(data.frame(matrix_tour)$v1),]

new_edge_list = matrix(0,length(matrix_tour),2) for(i in 1:length(matrix_tour)){ new_edge_list[i,1] = matrix_tour[i,1] if(i!=length(matrix_tour)) { new_edge_list[i,2] = matrix_tour[i+1,1] } else{ new_edge_list[i,2] = matrix_tour[1,1] } } g2 <- graph_from_edgelist(as.matrix(new_edge_list)) plot(g2)

If you plot g and g2, you can see that the g2 does not contain the edges from the original g graph. This means that the optimal tour find by the TSP does not have those edges from the original graph.

My question is that how can we restrict the TSP solver to give us the tour, which only contains edges from the original graph, else return that no tour found.

Note: I have tried to replace 0 with some higher weight in the adjacency matrix, however, it still give same problem as described above.

Edit: I have also posted that question on StackOverflow, you can check it on the following URL https://stackoverflow.com/questions/61260887/tsp-not-giving-optimum-tour-using-r-language

mhahsler commented 4 years ago

I have updated the code on StackOverflow. All the nodes in your graph have degree 3. There might be no solution.

sarwanpasha commented 4 years ago

Can you please check the code below. Actually the problem which I have diagnosed is that in case of high dimensional distance matrix, the as.dist function do not compute efficient distances. Therefore, I do not get the optimum tour for the TSP. For this purpose, I tried to manipulate the distance matrix. Here is what I did: First I created the distance matrix using as.dist function. Then I manually created another distance matrix and replaced its entries with the entries of original distance matrix. In this way, the curse of dimenionality problem is solved.

However, when I try to run the code on Concorde, It sticks again (same as it does before).

Here is the code

edgelist <- structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 19L, 19L, 20L, 20L, 21L, 22L, 22L, 22L, 23L, 23L, 24L, 24L, 24L, 25L, 25L, 26L, 26L, 26L, 27L, 27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 32L, 32L, 33L, 33L, 33L, 34L, 34L, 37L, 38L, 38L, 39L, 40L, 41L, 41L, 42L, 43L, 46L, 47L, 48L, 48L, 49L, 53L, 54L, 58L, 64L), V2 = c(3L, 9L, 61L, 17L, 31L, 51L, 40L, 46L, 42L, 47L, 63L, 8L, 18L, 39L, 30L, 40L, 62L, 13L, 31L, 58L, 50L, 63L, 25L, 35L, 16L, 27L, 44L, 19L, 45L, 54L, 21L, 47L, 55L, 51L, 66L, 35L, 57L, 61L, 18L, 20L, 63L, 52L, 53L, 21L, 37L, 23L, 52L, 56L, 32L, 57L, 34L, 38L, 44L, 60L, 49L, 57L, 36L, 56L, 62L, 36L, 46L, 43L, 60L, 64L, 60L, 65L, 44L, 45L, 52L, 31L, 37L, 41L, 54L, 56L, 35L, 36L, 43L, 48L, 66L, 39L, 50L, 55L, 45L, 59L, 49L, 59L, 42L, 58L, 55L, 65L, 62L, 50L, 51L, 53L, 61L, 65L, 59L, 64L, 66L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90", "91", "92", "93", "94", "95", "96", "97", "98", "99"))

edgelist_2 <- structure(list(V1 = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L), V2 = c(4L, 5L, 3L, 4L, 5L, 2L, 6L, 1L, 2L, 5L, 1L, 2L, 4L, 6L, 3L, 4L, 5L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17"))

final_edgelist = edgelist

g <- graph_from_edgelist(as.matrix(final_edgelist)) plot(g)

adj_mat = as_adjacency_matrix(g, sparse = FALSE) adj_mat_2 = adj_mat

nodelist = unique(as.vector(as.matrix(final_edgelist))) adj_mat_sym = matrix(0,length(unique(nodelist)),length(unique(nodelist))) for(i in 1:length(final_edgelist[,1])){ adj_mat_sym[final_edgelist[i,1],final_edgelist[i,2]] = 1 adj_mat_sym[final_edgelist[i,2],final_edgelist[i,1]] = 1 }

M = adj_mat_sym lower_triangle = M[lower.tri(M)] lower_triangle_2 = lower_triangle lower_triangle_2[lower_triangle_2==0] = 2 lower_triangle_2[lower_triangle_2==1] = 0 lower_triangle_2[lower_triangle_2==2] = 100

d <- as.dist(1/(1+adj_mat_2)) d_5 = d d_5[1:length(d_5)] = lower_triangle_2[1:length(lower_triangle_2)]

tsp <- TSP(d_5) tsp_check <- TSP(d)

o2 <- solve_TSP(tsp) o2 as.integer(o2)

o2 <- solve_TSP(tsp, method="concorde",rep=10, control = list(clo = "-V")) o2 as.integer(o2)

mhahsler commented 4 years ago

I found out that igraph needs directed = FALSE. This should resolve the problem. However, Concorde is numerically instable with this type of data and that is why it fails.

# create graph
g <- graph_from_edgelist(as.matrix(edgelist), directed = FALSE)
plot(g)

# convert graph into a distance matrix
d <- as.dist((1/as_adjacency_matrix(g, sparse = FALSE))-1)

sarwanpasha commented 4 years ago

So basically you want to say that the distance matrix has to be symmetric?

sarwanpasha commented 4 years ago

I have run the code with your suggested changes. It is still not giving the optimum tour.

sarwanpasha commented 4 years ago

Can you explain why you added -1 in the following line instead of +1, which you suggested before? d <- as.dist((1/as_adjacency_matrix(g, sparse = FALSE))-1)

sarwanpasha commented 4 years ago

ok I found my mistake. It is working fine now.

Here is my final working code in case anyone want to check it

dataset_path = "E:/RA/Pablo Moscato/dataset/FHCPCS/"

name = "graph1.txt" #Found

name = "graph2.txt" #Found

name = "graph3.txt"

name = "graph4.txt"

name = "graph5.txt"

name = "graph6.txt" #Found

name = "graph7.txt"

name = "graph8.txt"

name = "graph9.txt"

name = "graph10.txt"

name = "graph13.txt"

name = "graph19.txt"

name = "graph40.txt"

dataset = read.table(paste(dataset_path,name,sep = ""))

main_function(dataset)

create graph

g <- graph_from_edgelist(as.matrix(dataset))

plot(g)

final_edgelist = dataset

g <- graph_from_edgelist(as.matrix(final_edgelist))

plot(g)

convert graph into a distance matrix

adj_mat = as_adjacency_matrix(g, sparse = FALSE) adj_mat_2 = adj_mat

nodelist = unique(as.vector(as.matrix(final_edgelist))) adj_mat_sym = matrix(0,length(unique(nodelist)),length(unique(nodelist))) for(i in 1:length(final_edgelist[,1])){ adj_mat_sym[final_edgelist[i,1],final_edgelist[i,2]] = 1 adj_mat_sym[final_edgelist[i,2],final_edgelist[i,1]] = 1 }

M = adj_mat_sym lower_triangle = M[lower.tri(M)] lower_triangle_2 = lower_triangle lower_triangle_2[lower_triangle_2==0] = 2 lower_triangle_2[lower_triangle_2==1] = 0 lower_triangle_2[lower_triangle_2==2] = 100

d <- as.dist(1/(1+adj_mat_2)) d_5 = d d_5[1:length(d_5)] = lower_triangle_2[1:length(lower_triangle_2)] d_6 <- (1/(1+d_5))

d_7 = d_6 d_7[d_7==1] = 0

tsp <- TSP(d_6)

tsp <- TSP(d_7)

tsp_check <- TSP(d)

o2 <- solve_TSP(tsp)

o2

as.integer(o2)

o2 <- solve_TSP(tsp, method="concorde",rep=10, control = list(clo = "-V"))

o2 <- solve_TSP(tsp_check, method="concorde",rep=10, control = list(clo = "-V"))

o2 as.integer(o2)

mhahsler commented 4 years ago

So basically you want to say that the distance matrix has to be symmetric?

No. I thought you want it to be symmetric. If you want an ATSP then the code would look like:

# create graph
g <- graph_from_edgelist(as.matrix(edgelist), directed = TRUE)
plot(g)

# convert graph into a distance matrix
d <- (1/as_adjacency_matrix(g, sparse = FALSE))-1

atsp <- ATSP(d)

I think Concorde will still be unstable with this. You might want to bring that up with the Concorde developers.

sarwanpasha commented 4 years ago

This is not my main point. There are two ways to solve Concorde TSP while giving edgelist as input. We know that graphs are usually not symmetric. Hence we can either assume that the edges are not directed and make symmetric graph, or we assume that edges are directed and solve it using ATSP (using asymmetric graph). In case of ATSP, the only difference is that we have to convert the ATSP to TSP using "reformulate_ATSP_as_TSP" function. In this case, the only difference is that the number of nodes become double as edges are directed.

We can take either assumption and solve it accordingly. Currently, I am taking the assumption that edges are undirected.

sarwanpasha commented 4 years ago

Concorde is now working perfectly fine with me using the code I pasted above

mhahsler commented 4 years ago

I have just uploaded a fix to GitHub. It should avoid the overflow for Concorde for your data. Please install the version from GitHub and test. Hopefully, this fixes the issues. Also, solve_TSP with method Concorde now accepts ATSPs and you do not need to used reformulate_ATSP_as_TSP anymore. Let me know if everything works now as expected and I will close this issue and prepare a CRAN release. Thanks, Michael.

sarwanpasha commented 4 years ago

When I try to install it through github, I get following error

install_github("mhahsler/TSP") Downloading GitHub repo mhahsler/TSP@master Error: Could not find tools necessary to compile a package In addition: Warning messages: 1: In untar2(tarfile, files, list, exdir) : skipping pax global extended headers 2: In untar2(tarfile, files, list, exdir) : skipping pax global extended headers

Can you please guide me that what is wrong here?

mhahsler commented 4 years ago

You need a working compiler installation. I have submitted the updates to CRAN and you should be able to install the new version from there.

sarwanpasha commented 4 years ago

You can close this issue now. The code is working fine as of now. Thanks

mhahsler / TSP

TSP not giving optimum Tour #14

name = "graph1.txt" #Found

name = "graph2.txt" #Found

name = "graph4.txt"

name = "graph5.txt"

name = "graph6.txt" #Found

name = "graph7.txt"

name = "graph8.txt"

name = "graph9.txt"

name = "graph10.txt"

name = "graph13.txt"

name = "graph19.txt"

name = "graph40.txt"

main_function(dataset)

create graph

g <- graph_from_edgelist(as.matrix(dataset))

plot(g)

plot(g)

convert graph into a distance matrix

tsp <- TSP(d_6)

tsp_check <- TSP(d)

o2 <- solve_TSP(tsp)

o2

as.integer(o2)

o2 <- solve_TSP(tsp_check, method="concorde",rep=10, control = list(clo = "-V"))