Closed alsm closed 8 years ago
Hello @alsm thank you for your pull request! :heart: :heart: :heart:
If you don't mind, I would like to take a different approach to tackle this, which is caused by named segments being split when adding new a new path
to the tree.
I'm adding a spec to the add
method that covers this properly and will add the proper skip
functionality.
Will ping you once is ready.
Thank you.
Hello @alsm, also found that this change does not work for the following scenario:
tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:foo", :foo
tree.add "/:bar", :bar
Both :foo
and :bar
should be children
of root
, but :bar
ends being a children node of :foo
.
I will take a better look at these new scenarios and get back to you.
Cheers.
Good catch, I forgot to make a change to the split point code just beneath that section. I've made this extra change for my own benefit, it got auto added to the PR. I'll be interested to see how you solve this.
Hello @alsm, I've opened Issue #5 to describe all the issues that you've discovered.
Going to work on a proper solution to those scenarios but wanted to thank you for uncovering this first.
Thank you.
Sorry I keep pushing changes to this. This latest seems to resolve the issue you've identified in #5 with not placing child nodes correctly. Again this is a naive implementation that solves my immediate issue.
Happy to help :)
@alsm I don't mind! I consider it good that you keep trying to workout this issue!
I was about to push my fix for the issue when noticed there is one extra challenge, with these changes in place, /:category/:post
is now above /:post
which causes #find
not to work.
This is caused by Node's priority
, but at the same time, is a bigger problem (bear with me while I go over the issue)
With current implementation, both /:category/:post
and /:post
set priority
at 1
, so Node#sort!
might place them above or bellow each other.
Now, when walking the path during search, it works by finding the first branch of the nodes and when detected a named parameter, it will extract the value using the key in the node.
Then it will detect if there is more path to consume and if so, will lookup for the node that matches that.
tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:post", :post
tree.add "/:category/:post", :category_post
# /
# :category/:post
# :post
tree.root.sort!
# /
# :post
# :category/:post
result = tree.find "/example"
pp result.found? # => false
pp result.params # => {"post" => "example"}
result = tree.find "/news/first-post"
pp result.found? # => false
pp result.params # => {"post" => "news"}
At this time, there is no easy solution for this, at least not on a Radix implementation.
I'm thinking raise an error for that scenario since will make hard walk the path without look ahead checks.
Hello @alsm, I wanted to let you know that I've pushed my correction as e0ef8d83da85b475a0a104c35176d077b62d6d0e.
While the commit message covers pretty much what we discussed here, wanted to explain the reason on why I went with that approach instead of your pull request and also provide some comments that hopefully will help you with Crystal in the future.
The following is the list of things that I wanted to point out:
The first two points are present in extract_key
method. Making copies of strings might result in extra cost of the operation (and memory), but even more is looping over each character to write it down.
Imagine this done against a 290+ routes tree and the number of temporary strings grows considerably.
To avoid iterating over each individual character, I use byte_slice
which returns a new String
using a given position, and only do this when the key is going to be stored/used for Node's key
or inside params (see Tree#find
)
I decided to defer as much as possible any new object instance until was really necessary (see Char::Reader
being instantiated only when both child's key and new node key are named segments.
For third point, took the same reasoning: defer as much as possible call another method. I could have added @[AlwaysInline]
to _same_key?
method which will ensure that LLVM optimization will inline it inside add
reducing the switching.
Please note that I don't think your implementation was bad, but only that I wanted to tackle this with the same approach taken for the rest of the codebase.
Hope you can take a look to the code and apply these techniques for your own gain.
Let me know if you have questions.
Cheers!
Thanks for the explanation, I knew my implementation would be naive and non optimal, interesting to see how you resolved it and where mine was deficient.
I'm fairly new at Crystal so this may not be the best way of doing this, but hopefully the intention is clear. The new test is taken from the example kamber config, without the change it fails to work correctly. The problem seems to be with splitting a named parameter when doing the string comparison for the radix tree.