Open adamakhtar opened 12 years ago
So on the console you start your cuba app with
rackup config.ru
which requires hello.rb.
Cuba.define is called with a block. If we go take a look at the define method we see
def self.define(&block)
app.run new(&block)
end
where app is a method
def self.app
@app ||= Rack::Builder.new
end
assigning a Rack::Builder object to @app.
Recap of Rack - what's Rack::Builder?
Rack::Builder implements a small DSL to iteratively construct Rack applications.
It provides three instance methods to construct a Rack app: run, use and map.
run lets you specify your app, which will be wrapped by Rack::Builder.
use lets you stack middleware on top of the app you specified with run.
map lets you map various apps and middleware to specific paths.
@cyx provided us all a link to an [article](http://cyrildavid.com/articles/2012/02/16/composing-apps) and gave an example of how to use map to compose an app made up of several other apps.
map "/blog" do
run Blog
end
map "/support" do
run Support
end
map "/docs" do
run Docs
end
Here we use Rack::Builder to create a website for a company split up into 3 separate apps, one each for blog, support and docs.
Note that Rack::Builder's run doesn't actually start a server or anything (right?); it simply wraps the app you pass to it. To get a server started you would do something like this
Rack::Handler::Mongrel.run Rack::Builder.new { run proc { |env| [status, headers, body] } }, :Port => 9879
Here's an example of Rack::Builder in use from the [Rack docs](http://rack.rubyforge.org/doc/classes/Rack/Builder.html)
app = Rack::Builder.new {
use Rack::CommonLogger
use Rack::ShowExceptions
map "/lobster" do
use Rack::Lint
run Rack::Lobster.new
end
}
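To make the run/use part of that recap concrete, here is a toy builder in plain Ruby. It is only a sketch of the idea: TinyBuilder and Shout are names I made up, this is not Rack's real implementation, and map is left out.

```ruby
# A toy stand-in for Rack::Builder (hypothetical; not Rack's real code).
class TinyBuilder
  def initialize(&block)
    @middleware = []
    @app = nil
    instance_eval(&block) if block
  end

  # Remember a middleware class (and any extra arguments) for later.
  def use(klass, *args)
    @middleware << [klass, args]
  end

  # Set the innermost app; nothing is served at this point.
  def run(app)
    @app = app
  end

  # Wrap the app in the middleware so the first `use` ends up outermost.
  def to_app
    @middleware.reverse.inject(@app) { |inner, (klass, args)| klass.new(inner, *args) }
  end

  def call(env)
    to_app.call(env)
  end
end

# Hypothetical middleware that upcases the response body.
class Shout
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map(&:upcase)]
  end
end

builder = TinyBuilder.new do
  use Shout
  run ->(env) { [200, { "Content-Type" => "text/plain" }, ["hello"]] }
end

status, headers, body = builder.call({})
# status is 200 and body is ["HELLO"]
```

Nothing listens on a port here; the builder only assembles an object that responds to call, which is the point made above about run not starting a server.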
OK, continuing.
So we've got ourselves an as-yet unconfigured Rack::Builder object.
So back in
def self.define(&block)
app.run new(&block)
end
this is where we pass our app to Rack::Builder.
First, Cuba creates an instance of itself, passing the block we defined in hello.rb (see below)
Cuba.define do
on get do
on "hello" do
res.write "Hello world!"
end
end
end
Cuba#initialize simply stores that block in an instance variable, @blk.
def initialize(&blk)
@blk = blk
@captures = []
end
So now Rack::Builder has a Cuba object. But for it to be a valid Rack app it must have a call method. Which it does!
def call(env)
dup.call!(env)
end
So in summary,
requiring hello.rb results in an instance of Cuba being created, with the block passed to define stored away in it.
So on run Cuba, I guess our app is started and simply waits for a request to come in.
When one does hit the server, Rack handles it and calls our app, passing the environment hash.
So assuming a request for "/hello",
our Cuba#call is called, which immediately duplicates itself (self being a Cuba instance) and calls call! on the copy.
I'm not sure why dup is being used here, but usually I use dup when I want to preserve the values of an object.
copy_obj = original_object.dup
a_method_which_could_change_stuff_in(copy_obj)
#original_object is still the way i want it!
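One plausible reason for the dup (my assumption, nobody in the thread confirms it) is that call! stores per-request data in instance variables, so working on a copy keeps the shared app object clean between requests. A minimal sketch with a made-up TinyApp class:

```ruby
# Hypothetical class, only to illustrate the dup.call! pattern.
class TinyApp
  attr_reader :env

  # Entry point: hand the request to a fresh shallow copy of ourselves.
  def call(env)
    dup.call!(env)
  end

  # Runs on the copy, so @env is never set on the original object.
  def call!(env)
    @env = env
    [200, {}, ["path: #{@env["PATH_INFO"]}"]]
  end
end

app = TinyApp.new
first  = app.call("PATH_INFO" => "/hello")
second = app.call("PATH_INFO" => "/bye")

app.env # still nil: the original never stored a request
```

Note that dup is a shallow copy, so this only protects instance variables that are reassigned, not objects that are mutated in place.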
anyway onwards!
def call(env)
dup.call!(env)
end
def call!(env)
@env = env
@req = Rack::Request.new(env)
@res = Cuba::Response.new
# This `catch` statement will either receive a
# rack response tuple via a `halt`, or will
# fall back to issuing a 404.
#
# When it `catch`es a throw, the return value
# of this whole `_call` method will be the
# rack response tuple, which is exactly what we want.
catch(:halt) do
instance_eval(&@blk)
res.status = 404
res.finish
end
end
Here we see call! stores the env, creates a Rack::Request object (a handy helper from Rack which provides methods such as get? and post?), and a Cuba::Response object.
Then on to the meat and potatoes.
We start with catch(:halt). If you are not familiar with catch and its counterpart throw, they are similar to rescue and raise, but whereas the latter are used for error situations, the former are used simply to control the flow of a program.
@cyx mentioned it's kind of similar to a goto statement. A short, excellent explanation by Avdi Grimm can be found at rubylearning.
There's no sign of its friend throw here, but we'll be meeting it shortly. In the meantime catch sits waiting for it whilst the contents of the block it is passed are executed.
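For anyone who hasn't used catch and throw before, here is a small standalone sketch. find_first_even is a made-up method; the shape (throw a value from deep inside, fall back to a default if nothing is thrown) mirrors the halt-or-404 arrangement in call!.

```ruby
# Hypothetical example: escape from nested loops with a payload.
def find_first_even(rows)
  catch(:found) do
    rows.each do |row|
      row.each do |n|
        # Jumps straight out of both loops; n becomes catch's return value.
        throw :found, n if n.even?
      end
    end
    :none # reached only when nothing was thrown
  end
end

result   = find_first_even([[1, 3], [5, 6, 8]]) # => 6
fallback = find_first_even([[1, 3]])            # => :none
```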
And it's finally here that the DSL we used to create our app gets executed with
instance_eval(&@blk)
on get do
on "hello" do
res.write "Hello world!"
end
end
So here's the on method.
def on(*args, &block)
  try do
    @captures = []
    return unless args.all? { |arg| match(arg) }
    yield(*captures)
    halt res.finish
  end
end
Immediately we see a call to Cuba#try (not the Rails try). Now I don't fully understand this method, but I know it's something to help with mapping stuff to paths (I think?). I'd love it if someone could explain how that works. For our simple app I don't think it does anything we need to care about, so I'm assuming that the contents of the block passed to it are simply executed.
@captures gets set to [].
Now we come to
return unless args.all? { |arg| match(arg) }
Well, what are our args?
on get do...
So we have one... get. It's not a string or symbol but actually the result of a method called get.
def get; req.get? end
get simply returns true if the request from the client was a GET request. In our case it was, so the boolean true is passed to on.
So this args.all? { |arg| match(arg) }
looks like this [true].all? { |arg| match(arg) }
So what does match(true) return?
def match(matcher, segment = "([^\\/]+)")
case matcher
when String then consume(matcher.gsub(/:\w+/, segment))
when Regexp then consume(matcher)
when Symbol then consume(segment)
when Proc then matcher.call
else
matcher
end
end
In this case true is none of the four types in the case expression, so match simply returns true.
Back in
return unless args.all? { |arg| match(arg) }
this means the return isn't executed and is instead skipped.
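To see that fall-through in isolation, here is a small sketch. match_kind is a helper I made up for illustration; it only reports which branch of a case like the one in match would be taken, and does none of Cuba's actual path matching.

```ruby
# Hypothetical helper, only for illustration; it is not Cuba's match.
def match_kind(matcher)
  case matcher # case/when compares with ===, e.g. String === matcher
  when String then :string
  when Regexp then :regexp
  when Symbol then :symbol
  when Proc   then :proc
  else matcher # anything else (true, false, nil) comes back untouched
  end
end

kinds = ["hello", /\d+/, :id, -> { true }].map { |m| match_kind(m) }
# => [:string, :regexp, :symbol, :proc]

passes = [true].all? { |arg| match_kind(arg) }           # => true, like `on get` for a GET
fails  = [false, "hello"].all? { |arg| match_kind(arg) } # => false, all? stops at the first false
```

So a false from get is enough to make the `return unless` line fire, and the string matcher after it is never even looked at.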
Phew.
So next is
yield(*captures)
First, @captures was unaltered in our case. If you look back at the case expression in #match
you'll notice that for three of those four types (String, Regexp and Symbol) #consume is called. In that method @captures can be modified. For now it's empty.
We yield to the inner block of our DSL
on "hello" do
res.write "Hello world!"
end
which again calls on... yup, it's one of those Inception-style things, but bear with me.
Ignoring try again, we are asked if the args ("hello") match.
So back in match we see
when String then consume(........
To keep things short, consume basically tests whether "hello" matches the path from the request and, if it does, tries to extract any params and store them in @captures (I'm not sure if captures is for params, just a hunch).
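Here is a rough sketch of what a consume-style step could look like. It is my own simplified stand-in, not the library's code: the real method works on the instance's env and captures, whereas this version takes them as arguments so it can run on its own.

```ruby
# Hypothetical, simplified stand-in for a consume-style method.
def consume(env, captures, pattern)
  # Match one leading chunk of PATH_INFO, followed by "/" or end of string.
  md = env["PATH_INFO"].match(/\A\/(#{pattern})(\/|\z)/)
  return false unless md

  matched, *vars = md.captures
  separator = vars.pop # the trailing "/" or "" from the last group
  env["SCRIPT_NAME"] += "/#{matched}"
  env["PATH_INFO"] = "#{separator}#{md.post_match}"
  captures.push(*vars) # whatever the inner groups captured, e.g. an id
  true
end

env = { "SCRIPT_NAME" => "", "PATH_INFO" => "/users/42/posts" }
captures = []

segment = "([^\\/]+)"
pattern = "users/:id".gsub(/:\w+/, segment) # same substitution as in match

outer = consume(env, captures, pattern)
after_outer = env.dup   # SCRIPT_NAME "/users/42", PATH_INFO "/posts"

inner = consume(env, captures, "posts")
after_inner = env.dup   # SCRIPT_NAME "/users/42/posts", PATH_INFO ""

miss = consume(env, captures, "comments") # => false, env left alone
```

In this sketch captures ends up as ["42"], which fits the hunch that captures hold the pieces of the URL pulled out by the matchers.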
The result of match is true, so again return isn't executed and we go back to
yield(*captures)
where finally
res.write "Hello world!"
is called, storing "Hello world!" in @res.
Unlike previous iterations, the final block doesn't call the method on.
yield(*captures) #we just finished here
halt res.finish
end
end
so we finally call halt res.finish
where res.finish returns
def finish
[@status, @headers, @body]
end
and is passed to halt as a response
def halt(response)
throw :halt, response
end
And here we finally get to meet catch's previously mentioned counterpart, throw, and its payload, the response.
We are immediately taken out of this Inception headbanger, back to the catch in the method call!(env)
def call!(env)
...
# This `catch` statement will either receive a
# rack response tuple via a `halt`, or will
# fall back to issuing a 404.
catch(:halt) do
instance_eval(&@blk)
res.status = 404
res.finish
end
end
catch receives our throw, which was buried deep in the bowels of that instance_eval, and returns its payload without executing the code involving res.status = 404.
So call returns the response, as Rack expects.
In summary, the DSL appears to act like a huge case statement.
It compares every on expression with the request and, if it doesn't match, moves on to the next sibling on. However, if it does match, it descends into the matched on via its block and continues to check whether any of the nested on expressions match the request. At some point we end up with a block of code intended to produce the result.
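To check my own understanding, the whole flow can be squeezed into a few dozen lines of plain Ruby. MiniCuba below is a toy I made up: no Rack, no try, no captures, and strings are compared against the whole path rather than consumed piece by piece. Only the instance_eval / on / catch / throw skeleton is kept.

```ruby
# A toy router (hypothetical; heavily simplified, not Cuba's code).
class MiniCuba
  def initialize(&blk)
    @blk = blk
  end

  def call(env)
    dup.call!(env)
  end

  def call!(env)
    @env = env
    @body = []
    catch(:halt) do
      instance_eval(&@blk)
      [404, {}, []] # reached only if no `on` threw :halt
    end
  end

  # Run the block only if every matcher is truthy, then stop everything.
  def on(*args)
    return unless args.all? { |arg| match(arg) }
    yield
    throw :halt, [200, {}, @body]
  end

  def get
    @env["REQUEST_METHOD"] == "GET"
  end

  def write(text)
    @body << text
  end

  private

  def match(matcher)
    case matcher
    when String then @env["PATH_INFO"] == "/#{matcher}"
    else matcher
    end
  end
end

app = MiniCuba.new do
  on get do
    on "hello" do
      write "Hello world!"
    end
  end
end

hit  = app.call("REQUEST_METHOD" => "GET",  "PATH_INFO" => "/hello")
miss = app.call("REQUEST_METHOD" => "POST", "PATH_INFO" => "/hello")
# hit  => [200, {}, ["Hello world!"]]
# miss => [404, {}, []]
```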
The reason why we have the catch and throw system is to avoid unnecessary processing, I guess. Imagine if it was like this
on get, "hello" do # two arguments here for brevity's sake, but perfectly valid in Cuba
  do_something
end
on post, "blah" do
  do_something_else
end
on get, "blahblah" do
  do_something_else!
end
If the request was a GET to "/hello", then Cuba would have found the correct code to run on the first on. Without catch and throw, though, it would unnecessarily check all the other routes declared.
I'm not sure if that's a massive performance penalty, but still, it's nice to be lean.
Maybe someone can confirm whether this is the primary reason or just a nice side effect.
Right, that's me done.
Some things I'd like to know: what are @captures, and what is the reason for the #try method?
wow, @robodisco, nice job.
re. #try, see the point cyx makes in #3. The way nested on calls work, at least for the regexp/string/symbol matchers, is to progressively 'eat' the URL by shifting the matching piece from PATH_INFO into SCRIPT_NAME, and leaving the remainder in PATH_INFO. That way, a route nested inside another matches on the remainder of the path, rather than the whole path. So effectively when you have something like
# GET /users/1/posts
on 'users/:id' do
# Here, PATH_INFO = "/posts", rather than "/users/1/posts",
# so that the following route matches properly
on 'posts' do
# ...
end
end
But the downside to mutating the path like this is that you would be left with inconsistent state after the nested route finished, if it didn't get reset afterwards. I can't think of a great example, but say you have some Rack middleware that comes into play after your app finishes, and needs to access PATH_INFO. Then you want that to be reset to the full path, not the last matching piece of the path. So this is the function of #try: by wrapping your route handler in #try, the path gets reset before finishing (halt).
# @private Used internally by #on to ensure that SCRIPT_NAME and
# PATH_INFO are reset to their proper values.
def try
script, path = env["SCRIPT_NAME"], env["PATH_INFO"]
yield
ensure
env["SCRIPT_NAME"], env["PATH_INFO"] = script, path
end
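A quick standalone check of that ensure behaviour, for the curious. This is a sketch: try here takes env as an argument (the real one reads it from the instance), and the values are made up.

```ruby
# Same shape as the method above, but with env passed in explicitly.
def try(env)
  script, path = env["SCRIPT_NAME"], env["PATH_INFO"]
  yield
ensure
  env["SCRIPT_NAME"], env["PATH_INFO"] = script, path
end

env = { "SCRIPT_NAME" => "", "PATH_INFO" => "/users/1/posts" }

# Case 1: the block finishes normally.
inside = nil
try(env) do
  env["SCRIPT_NAME"] = "/users/1"
  env["PATH_INFO"] = "/posts"
  inside = env["PATH_INFO"]
end
after_normal = env.dup

# Case 2: the block is left early via throw, as halt does.
result = catch(:halt) do
  try(env) do
    env["PATH_INFO"] = ""
    throw :halt, :response
  end
end
after_throw = env.dup
```

In both cases env is back to its starting values afterwards, because ensure runs whether the block returns normally or is unwound by a throw.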
Hi Eric,
For your example, there should be no inconsistent state.
# GET /users/1/posts
on 'users/:id' do
# Here, PATH_INFO = "/posts", rather than "/users/1/posts",
# so that the following route matches properly
on "posts" do
# PATH_INFO=""
end
# PATH_INFO back to /posts
end
# PATH_INFO back to /users/1/posts
Thanks, cyx
I think the Readme does a good job explaining captures, but they are basically very similar to Sinatra's - they are pieces of the URL that get matched and then passed into your route handler as parameters.
Except that Cuba has two kinds of captures that Sinatra doesn't have:
on get, extension("css") do |basename| end
will give you basename = "example" from the path GET /example.css
on post, "foo", param("a"), param("b"), param("c") do |a,b,c| end
will give you a, b, c = "1", "2", "3" from the path POST /foo?a=1&b=2&c=3 (or likewise if the params come from a form). Basically saving you the work of assigning local variables for each param within your handler. (Note I haven't tested any of this, this is just going on what the Readme says and what it looks like in the code; the syntax may not be exactly right.)
@cyx, exactly, what I meant was (but wasn't very clear): if you didn't have the try wrapper, you'd be left with inconsistent state. Thus the need for try, which robodisco was asking about.
So it seems like the only bits of state that get mutated by the framework (and reset appropriately for nested routes) are SCRIPT_NAME and PATH_INFO, and @captures. Is that right?
thanks @cyx and @ericgj, that helps out a lot. A few more things I'm not sure of, but a re-walkthrough should clear things up.
One thing however: I see you mentioning the word 'state' a lot. Is this a Cuba 'thing' or a Rack 'thing'? I feel a bit stupid asking :-) but as I said in the beginning, no such thing as a stupid question. What other constants like SCRIPT_NAME are there relating to state?
Hi Eric,
Here is all the stuff manipulated:
req - when you change SCRIPT_NAME and PATH_INFO, it technically changes.
res - when you write the response.
And yes, @captures, but this is more internal state rather than something that the user should know.
@robodisco - it's more a Rack thing. More examples of state related to rack can be seen in middleware, which use env a lot, env["rack.session"] is the most common example, and last I checked the warden middleware uses env a lot too.
Thanks, cyx
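To illustrate the point about middleware keeping state in env, here is a tiny sketch in plain Ruby. RequestId and the "example.request_id" key are made up for the example; real middleware such as the session one uses its own keys.

```ruby
# Hypothetical middleware that leaves a note in env for the app below it.
class RequestId
  def initialize(app)
    @app = app
    @count = 0
  end

  def call(env)
    @count += 1
    env["example.request_id"] = @count # state travels with the request
    @app.call(env)
  end
end

# The inner app only knows about env, not about the middleware itself.
app = ->(env) { [200, {}, ["request #{env["example.request_id"]}"]] }
stack = RequestId.new(app)

status, headers, body = stack.call({})
# body => ["request 1"]
```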
Yes @captures, SCRIPT_NAME and PATH_INFO are the only ones manipulated.
Just realised I never mentioned everyone in this issue. Better late than never so....
oi @codereading/readers just done a walkthrough - come and check it out!
thanks @cyx!
Hi @robodisco,
Good eye regarding the throw as a performance improvement. It's not much, but it was something like a 2-3ms improvement, depending on the number of on statements you have in your app (we tried with around 10 that time).
We used to do it differently and changed it somewhere along 2.x. Here's the sketch commit that I made 10 months ago: fe467d233b2cefdcec862f4adbf44a281605754f
Thanks, cyx
Aside from the performance improvement, it was also a refactoring, since it used to be that run depended on a throw, and the normal flow didn't. We kinda hit 2 birds with one stone by making throw :halt, tuple the de facto way to tell Cuba that, "ok we're done, here's the response". Overall we're still happy with it.
I'll give this a shot, but instead of using the cuba-app example I'm going to use a much simpler example such as this
and a corresponding config.ru