rklaehn / radixtree

A fast and generic immutable radix tree for scala
Apache License 2.0
96 stars 14 forks source link

Prefixes of #1

Closed czcollier closed 8 years ago

czcollier commented 8 years ago

I have added a filterPrefixesOf method (and associated test) for searching the tree for all prefixes of a search key. Different use case than filterPrefix, but efficient for the problem in that it makes a traversal down to the longest node that matches then returns that path once a non-match is encountered. See git commit message for more comments.

I needed this for some software I am building that needs to quickly find, from a collection of URI paths, all paths that are prefixes of a search path. This served the purpose well.

Hopefully this addition is something you find would be generally useful.

codecov-io commented 8 years ago

Current coverage is 91.37%

Merging #1 into master will increase coverage by +0.44% as of b3f99da

@@            master      #1   diff @@
======================================
  Files            7       7       
  Stmts          386     406    +20
  Branches       108     118    +10
  Methods          0       0       
======================================
+ Hit            351     371    +20
  Partial          0       0       
  Missed          35      35       

Review entire Coverage Diff as of b3f99da

Powered by Codecov. Updated on successful CI builds.

rklaehn commented 8 years ago

Thanks a lot. I had something similar a while ago, and wasn't sure if it was generally useful enough. If you need it too, I guess this is evidence that it is.

rklaehn commented 8 years ago

Looks good. Not sure what you mean by the lack of path compression though. RadixTree is not a case class. The copy method will ensure that the invariants are upheld, so a redundant node with just one child and no value will not be kept.

E.g.

scala> import com.rklaehn.radixtree._
scala> val x = RadixTree("aa" -> 1, "ab" -> 1, "ac" -> 1).filterPrefixesOf("abc")
scala> x.printStructure
res0: String = RadixTree(ab, Opt(1), [])
czcollier commented 8 years ago

Great, I'm glad my addition was helpful.

Regarding the path compression, I may have given it the wrong name, but yes removing the nodes with no value is what I was referring to. It was confusing to me at first because inside filterPrefix, it empties the values of nodes when it copies them, so (I think) some nodes get merged/removed, but it is OK because, as you said, the invariants are upheld in that case.

But, filterPrefixesOf did not work (returned only a single node) when I set the node values to empty on copying (like in filterPrefix), but works correctly when I don't empty the node values.

Anyway, thanks for being responsive on the pull request!

Chris