luislavena / radix

Radix Tree implementation for Crystal
MIT License
102 stars 14 forks source link

Skip named parameters when looking for key/path differences #4

Closed alsm closed 8 years ago

alsm commented 8 years ago

I'm fairly new at Crystal so this may not be the best way of doing this, but hopefully the intention is clear. The new test is taken from the example kamber config, without the change it fails to work correctly. The problem seems to be with splitting a named parameter when doing the string comparison for the radix tree.

luislavena commented 8 years ago

Hello @alsm thank you for your pull request! :heart: :heart: :heart:

If you don't mind, I would like to take a different approach to tackle this, which is caused by named segments being split when adding new a new path to the tree.

I'm adding a spec to the add method that covers this properly and will add the proper skip functionality.

Will ping you once is ready.

Thank you.

luislavena commented 8 years ago

Hello @alsm, also found that this change does not work for the following scenario:

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:foo", :foo
tree.add "/:bar", :bar

Both :foo and :bar should be children of root, but :bar ends being a children node of :foo.

I will take a better look at these new scenarios and get back to you.

Cheers.

alsm commented 8 years ago

Good catch, I forgot to make a change to the split point code just beneath that section. I've made this extra change for my own benefit, it got auto added to the PR. I'll be interested to see how you solve this.

luislavena commented 8 years ago

Hello @alsm, I've opened Issue #5 to describe all the issues that you've discovered.

Going to work on a proper solution to those scenarios but wanted to thank you for uncovering this first.

Thank you.

alsm commented 8 years ago

Sorry I keep pushing changes to this. This latest seems to resolve the issue you've identified in #5 with not placing child nodes correctly. Again this is a naive implementation that solves my immediate issue.

Happy to help :)

luislavena commented 8 years ago

@alsm I don't mind! I consider it good that you keep trying to workout this issue!

I was about to push my fix for the issue when noticed there is one extra challenge, with these changes in place, /:category/:post is now above /:post which causes #find not to work.

This is caused by Node's priority, but at the same time, is a bigger problem (bear with me while I go over the issue)

With current implementation, both /:category/:post and /:post set priority at 1, so Node#sort! might place them above or bellow each other.

Now, when walking the path during search, it works by finding the first branch of the nodes and when detected a named parameter, it will extract the value using the key in the node.

Then it will detect if there is more path to consume and if so, will lookup for the node that matches that.

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:post", :post
tree.add "/:category/:post", :category_post
# /
#  :category/:post
#  :post

tree.root.sort!
# /
#  :post
#  :category/:post

result = tree.find "/example"
pp result.found? # => false
pp result.params # => {"post" => "example"}

result = tree.find "/news/first-post"
pp result.found? # => false
pp result.params # => {"post" => "news"}

At this time, there is no easy solution for this, at least not on a Radix implementation.

I'm thinking raise an error for that scenario since will make hard walk the path without look ahead checks.

luislavena commented 8 years ago

Hello @alsm, I wanted to let you know that I've pushed my correction as e0ef8d83da85b475a0a104c35176d077b62d6d0e.

While the commit message covers pretty much what we discussed here, wanted to explain the reason on why I went with that approach instead of your pull request and also provide some comments that hopefully will help you with Crystal in the future.

The following is the list of things that I wanted to point out:

The first two points are present in extract_key method. Making copies of strings might result in extra cost of the operation (and memory), but even more is looping over each character to write it down.

Imagine this done against a 290+ routes tree and the number of temporary strings grows considerably.

To avoid iterating over each individual character, I use byte_slice which returns a new String using a given position, and only do this when the key is going to be stored/used for Node's key or inside params (see Tree#find)

I decided to defer as much as possible any new object instance until was really necessary (see Char::Reader being instantiated only when both child's key and new node key are named segments.

For third point, took the same reasoning: defer as much as possible call another method. I could have added @[AlwaysInline] to _same_key? method which will ensure that LLVM optimization will inline it inside add reducing the switching.

Please note that I don't think your implementation was bad, but only that I wanted to tackle this with the same approach taken for the rest of the codebase.

Hope you can take a look to the code and apply these techniques for your own gain.

Let me know if you have questions.

Cheers!

alsm commented 8 years ago

Thanks for the explanation, I knew my implementation would be naive and non optimal, interesting to see how you resolved it and where mine was deficient.