Error in CalculateShortestPathKernel function

Hello,

The current implementation of the Shortest Path Kernel lies in the function CalculateShortestPathKernel:

CalculateShortestPathKernel <- function(G) {
  G.floyd <- as.list(rep(NA, length(G)))
  for (i in 1:length(G)) {
    D <- distances(G[[i]])
    G.floyd[[i]] <- make_full_graph(vcount(G[[i]])) %>% set_edge_attr("weight", value = D[lower.tri(D)])
  }
  CalculateKStepRandomWalkKernel(G, c(0, 1))
}

This implementation is not faithful to the kernel described in Borgwardt et al. (2005) for 2 reasons:

Firstly, G.floyd should be passed to CalculateKStepRandomWalkKernel instead of G (the random walk kernel must be applied to the floyd transformations of the graphs).

Secondly, I think that in the Floyd transform, the nodes are connected only if there exist a path between them in the original graph. Indeed, in quote Borgwardt et al. (2005) in the section 4.1 : "A shortest-paths graph S contains the same set of nodes as the input graph I. Unlike in the input graph, there exists an edge between all nodes in S which are connected by a walk in I.". However, in your code, you use make_full_graph, which put an edge between all pairs of nodes, even those who are not connected by a walk. But, may be I miss-understand the paper...

I propose a corrected version:

Corrected_CalculateShortestPathKernel <- function(G) {
    G.floyd <- as.list(rep(NA, length(G)))
  for (i in 1:length(G)) {
    D <- distances(G[[i]])
    FloydAdj = matrix(1. * (is.finite(as.vector(D))),dim(D)[1],dim(D)[1])
    G.floyd[[i]] = graph_from_adjacency_matrix(FloydAdj,mode="directed")
    G.floyd[[i]] = set_edge_attr(G.floyd[[i]] , name = "weight" , value = as.vector(t(D))[is.finite(as.vector(t(D)))] )
  }
  CalculateKStepRandomWalkKernel(G.floyd, c(0, 1))
}

PS: thank you for this useful package anyway =)

cran / graphkernels

Error in CalculateShortestPathKernel function #1