Trackage / pathos

0 stars 0 forks source link

find paths #2

Open mdsumner opened 7 years ago

mdsumner commented 7 years ago

find_paths has a reasonable stab at building paths from a set of segments. It's slow, but a doomed attempt at vectorization was exciting until it became clear that it worked only because the test set was actually ordered ... so, yeah.

C++ would speed up "cycles()", but it's not really all that slow it's just icky. Test on a really big data set.

Finding paths from segments is good for "removing internal boundaries", that simple features doesn't like, and for being able to do segmenty things with 1D topologies.

mdsumner commented 7 years ago

Package datastructures might help

mdsumner commented 7 years ago

Seems to:

## try using hash to insert the result as we go (or an env?)
hash_scan <- function(edges) {
  seen <- rep(FALSE, nrow(edges))
  v <- edges[[1]]
  j <- 0
  cycles <- list()
  while(!all(seen)) {
    f <- hashmap("character", "character")
    f <- insert(f, edges[[1]][!seen], edges[[2]][!seen])

   i <- 0
   l <- list()
   this <- start <- edges[[1]][!seen][1]
   repeat({
    i <- i + 1
    this <- get(f, this)
    l[[i]] <- this
    #print(i)
    if (this == start) break; 
   })
   seen[v %in% unlist(l)] <- TRUE
   j <- j + 1
   cycles[[j]] <- unlist(l)

  }
 bind_rows(lapply(cycles, function(x) tibble(index = x)), .id = "path")

}
mdsumner commented 7 years ago

This should work as pointed out by @mpadge

ggm::fundCycles(igraph::as_adj(igraph::graph_from_data_frame(edges)))

But, as_adj returns a sparse matrix and ggm wants an full matrix, so this would be prohibitive. Perhaps the ggm logic can be applied to the sparse graph?

mpadge commented 7 years ago

@mdsumner i've actually got a find paths routine buried amongst lots of code here and here - crib away to your heart's content. My use-case is a series of user-specified segments hypothesised to contain a closed path, so I first find the longest path that can connect all segments, then find the shortest path through that sequence. You likely only want the latter, and that's the easy bit, and is mostly just the connect_highways() function.

mdsumner commented 7 years ago

Nice! That's a good idea, I'll explore those. I'm messing around with Atlantis model geometry at the moment, it's a doubly-connected-edge-list, where the faces (segments) and the boxes (the polygons) are both first class entities, and they need to be interpreted with extension into depth independently as "ribbons" and "volumes", primarily for flux across faces (e.g. ocean current transport between boxes) as well as mass box properties (temperature, salinity). The exact same path/segment flip is required since the entities use a single "vertex pool", and this is directly relevant to animal tracking, integration with remote sensing and ocean/atmosphere models. At root it's a discrete vs. continuous distinction, basically requiring identity and structure at "sub-feature" levels. (QGIS calls this "geometry generators").

I hope together we can help identify some of these common patterns and get them appropriately elevated into the basic toolkit.