The current implementation of the Shortest Path Kernel lies in the function CalculateShortestPathKernel:
CalculateShortestPathKernel <- function(G) {
G.floyd <- as.list(rep(NA, length(G)))
for (i in 1:length(G)) {
D <- distances(G[[i]])
G.floyd[[i]] <- make_full_graph(vcount(G[[i]])) %>% set_edge_attr("weight", value = D[lower.tri(D)])
}
CalculateKStepRandomWalkKernel(G, c(0, 1))
}
This implementation is not faithful to the kernel described in Borgwardt et al. (2005) for 2 reasons:
Firstly, G.floyd should be passed to CalculateKStepRandomWalkKernel instead of G (the random walk kernel must be applied to the floyd transformations of the graphs).
Secondly, I think that in the Floyd transform, the nodes are connected only if there exist a path between them in the original graph. Indeed, in quote Borgwardt et al. (2005) in the section 4.1 : "A shortest-paths graph S contains the same set of nodes as the input graph I. Unlike in the input graph, there exists an edge between all nodes in S which are connected by a walk in I.". However, in your code, you use make_full_graph, which put an edge between all pairs of nodes, even those who are not connected by a walk. But, may be I miss-understand the paper...
I propose a corrected version:
Corrected_CalculateShortestPathKernel <- function(G) {
G.floyd <- as.list(rep(NA, length(G)))
for (i in 1:length(G)) {
D <- distances(G[[i]])
FloydAdj = matrix(1. * (is.finite(as.vector(D))),dim(D)[1],dim(D)[1])
G.floyd[[i]] = graph_from_adjacency_matrix(FloydAdj,mode="directed")
G.floyd[[i]] = set_edge_attr(G.floyd[[i]] , name = "weight" , value = as.vector(t(D))[is.finite(as.vector(t(D)))] )
}
CalculateKStepRandomWalkKernel(G.floyd, c(0, 1))
}
Hello,
The current implementation of the Shortest Path Kernel lies in the function CalculateShortestPathKernel:
This implementation is not faithful to the kernel described in Borgwardt et al. (2005) for 2 reasons:
Firstly, G.floyd should be passed to CalculateKStepRandomWalkKernel instead of G (the random walk kernel must be applied to the floyd transformations of the graphs).
Secondly, I think that in the Floyd transform, the nodes are connected only if there exist a path between them in the original graph. Indeed, in quote Borgwardt et al. (2005) in the section 4.1 : "A shortest-paths graph S contains the same set of nodes as the input graph I. Unlike in the input graph, there exists an edge between all nodes in S which are connected by a walk in I.". However, in your code, you use make_full_graph, which put an edge between all pairs of nodes, even those who are not connected by a walk. But, may be I miss-understand the paper...
I propose a corrected version:
PS: thank you for this useful package anyway =)