Walkthrough #4

Open adamakhtar opened 12 years ago

adamakhtar commented 12 years ago

I'll give this a shot, but instead of using the cuba-app example I'm going to use a much simpler example such as this:

### hello.rb

require "cuba"

Cuba.define do
  on get do
    on "hello" do
      res.write "Hello world!"
    end
  end
end

and a corresponding

# cat
require "./hello"

run Cuba

adamakhtar commented 12 years ago

So on the console you start your cuba app with


which requires hello.rb.

Cuba.define is called with a block. If we go take a look at the define method we see

def self.define(&block)
  app.run new(&block)
end

where app is a method

    @app ||= Rack::Builder.new

assigning a Rack::Builder object to @app.

Recap of Rack: what's Rack::Builder?

Rack::Builder implements a small DSL to iteratively construct Rack applications.

It provides three instance methods to construct a Rack app: run, use and map.

run lets you specify your app, which will be wrapped by Rack::Builder.

use lets you stack middleware on top of the app you specified with run.

map lets you map various apps and middlewares to specific paths.
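To make run and use a bit more concrete, here is a toy builder in plain Ruby. It is only a sketch of the idea and not Rack's actual implementation: TinyBuilder, AddHeader and the x-demo header are made-up names, and map is left out entirely.

```ruby
# Hypothetical stand-in for Rack::Builder, for illustration only.
class TinyBuilder
  def initialize(&block)
    @middlewares = []
    @app = nil
    instance_eval(&block) if block
  end

  # Remember a middleware class; it gets wrapped around the app later.
  def use(klass, *args)
    @middlewares << [klass, args]
  end

  # Set the innermost app.
  def run(app)
    @app = app
  end

  # Build the final callable: the first `use` ends up outermost.
  def to_app
    @middlewares.reverse.inject(@app) do |inner, (klass, args)|
      klass.new(inner, *args)
    end
  end
end

# Hypothetical middleware that adds one header to every response.
class AddHeader
  def initialize(app, name, value)
    @app, @name, @value = app, name, value
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers.merge(@name => @value), body]
  end
end

# Anything that responds to call(env) and returns a triple can be the app.
HELLO = lambda { |env| [200, { "content-type" => "text/plain" }, ["Hello world!"]] }

TINY_APP = TinyBuilder.new do
  use AddHeader, "x-demo", "1"
  run HELLO
end.to_app
```

Calling TINY_APP.call({}) goes through AddHeader first and then reaches the lambda, which is the same layering that use and run describe.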

@cyx provided us all a link to an article and gave an example of how to use map to compose an app made up of several other apps.

map "/blog" do
  run Blog
end

map "/support" do
  run Support
end

map "/docs" do
  run Docs
end

Here we see Rack::Builder used to create a website for a company split up into 3 separate apps, one each for blog, docs and support.

Note that Rack::Builder.new { blah } doesn't actually run a server or anything (right?); it simply wraps the passed code. To get a server started you would hand an app to a Rack handler, along the lines of passing { [response, header, body] }, :port => 9879 to the handler's run method.

Here's an example of Rack::Builder in use from the Rack docs:

app = Rack::Builder.new {
   use Rack::CommonLogger
   use Rack::ShowExceptions
   map "/lobster" do
     use Rack::Lint
     run Rack::Lobster.new
   end
}

Ok continuing

adamakhtar commented 12 years ago

So we've got ourselves an as yet unconfigured Rack::Builder object.

So back in

  def self.define(&block)
    app.run new(&block)
  end

this is where we pass our app to Rack::Builder.

First Cuba creates an instance of itself passing the block we defined in hello.rb (see below)

Cuba.define do
  on get do
    on "hello" do
      res.write "Hello world!"
    end
  end
end

In Cuba#initialize it simply stores that block in an instance variable, @blk.

 def initialize(&blk)
    @blk = blk
    @captures = []
 end

So now Rack::Builder has a Cuba object. But for it to be a valid Rack app it must have a call method. Which it does!

def call(env)
  dup.call!(env)
end

adamakhtar commented 12 years ago

So in summary,

requiring hello.rb results in an instance of Cuba being created, with the block passed to define stored away in it.

So on run Cuba I guess our app is started and simply waits for a request to come in.

When one does hit the server, Rack handles it and calls our app, passing the environment hash.

So assuming a request for "/hello"

our Cuba#call is called, which immediately duplicates itself (self being a Cuba instance) and calls call! on the copy.

I'm not sure why it's being used here, but usually I use dup when I want to preserve the values of an object.

copy_obj = original_object.dup


# original_object is still the way I want it!
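One plausible reason for the dup (a reading of the code, not something confirmed in this thread) is that a single app instance serves every request, so per-request instance variables are better kept on a throwaway copy. A minimal sketch, where DupDemo is a made-up class that only mimics the call/call! pair:

```ruby
# Hypothetical app class, loosely modeled on a call/call! pair.
class DupDemo
  attr_reader :env

  def call(env)
    dup.call!(env)   # do the work on a throwaway copy
  end

  def call!(env)
    @env = env       # per-request state lives on the copy only
    [200, {}, ["path was #{env["PATH_INFO"]}"]]
  end
end

DUP_APP = DupDemo.new
DUP_RESPONSE = DUP_APP.call({ "PATH_INFO" => "/hello" })
# DUP_APP.env is still nil here: the shared instance was never touched.
```

If call assigned @env directly on the shared instance, two requests handled at the same time could overwrite each other's state.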

anyway onwards!

  def call(env)
    dup.call!(env)
  end

  def call!(env)
    @env = env
    @req = Rack::Request.new(env)
    @res = Cuba::Response.new

    # This `catch` statement will either receive a
    # rack response tuple via a `halt`, or will
    # fall back to issuing a 404.
    # When it `catch`es a throw, the return value
    # of this whole `_call` method will be the
    # rack response tuple, which is exactly what we want.
    catch(:halt) do
      instance_eval(&@blk)

      res.status = 404
      res.finish
    end
  end

Here we see call! stores the env, creates a Rack::Request object (a handy helper from Rack which provides methods such as get? and post?), and a Cuba::Response object.

Then on to the meat and potatoes.

We start with catch(:halt). If you are not familiar with catch and its counterpart throw, they're similar to rescue and raise, but whereas the latter are used for error situations, the former are used simply to control the flow of a program.

@cyx mentioned it's kind of similar to a goto statement. A short, excellent explanation by Avdi Grimm can be found at rubylearning.

There's no sign of its friend throw here but we'll be meeting it shortly. In the meantime catch sits waiting for it whilst the contents of the block it is passed are executed.
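For anyone who hasn't used catch and throw before, here is a small standalone sketch that has nothing to do with Cuba. The method name first_even and the :found tag are made up for the example.

```ruby
# Hypothetical helper: finds the first even number in nested arrays.
def first_even(rows)
  catch(:found) do
    rows.each do |row|
      row.each do |n|
        throw :found, n if n.even?   # jumps straight out of both loops
      end
    end
    :none   # only reached when nothing was thrown
  end
end
```

The value given to throw becomes the return value of the catch block; if nothing is thrown, the block's last expression is returned instead.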

And it's finally here that the DSL we used to create our app gets executed, with

instance_eval(&@blk)

adamakhtar commented 12 years ago

on get do
    on "hello" do
      res.write "Hello world!"
    end
end

So here's the on method.

def on(*args, &block)
    try do

      @captures = []
      return unless args.all? { |arg| match(arg) }

      yield(*captures)

      halt res.finish
    end
end

Immediately we see a call to Cuba#try (not the Rails #try). Now I don't fully understand this method, but I know it's something to help with mapping stuff to paths (I think?). I'd love it if someone could explain how that works. For our simple app I don't think it does anything we need to care about, so I'm assuming that the contents of the block passed to it are simply executed.

We get @captures set to []

Now we come to

return unless args.all? { |arg| match(arg) }

Well, what are our args?

on get do...

So we have one: get. It's not a string or symbol but actually the result of a method called get.

def get; req.get? end

get simply returns true if the request from the client was a get request. In our case it was so the boolean true is returned to "on".

So this args.all? { |arg| match(arg) }

looks like this [true].all? { |arg| match(arg) }

So what does match(true) return?

  def match(matcher, segment = "([^\\/]+)")
    case matcher
    when String then consume(matcher.gsub(/:\w+/, segment))
    when Regexp then consume(matcher)
    when Symbol then consume(segment)
    when Proc   then matcher.call
    else
      matcher
    end
  end

In this case true is none of the four types in the case expression, so match simply returns true back to

return unless args.all? { |arg| match(arg) }

This means the above return isn't executed and is skipped instead.
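The way plain values such as true pass straight through the case expression, and how all? combines the results, can be tried out in isolation. This is a simplified sketch: toy_match and all_match? are made-up names, and the real consume is swapped for a plain string comparison to keep it short.

```ruby
# Simplified stand-in for the matcher logic. Comparing the whole path
# is an assumption made for brevity; it is not what consume does.
def toy_match(matcher, path)
  case matcher
  when String then path == "/#{matcher}"
  when Proc   then matcher.call
  else matcher            # true, false, nil, ... come back untouched
  end
end

# Every argument has to match, just like args.all? in on.
def all_match?(args, path)
  args.all? { |arg| toy_match(arg, path) }
end
```

So all_match?([true], "/hello") is true, while a single false or non-matching string anywhere in the list makes the whole thing false and on would return early.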


So next is

yield(*captures)

First, @captures was unaltered in our case. If you look back at the case expression in #match you'll notice that for the String, Regexp and Symbol cases, #consume is called. In that method @captures can be modified. For now it's empty.

We yield the inner block of our DSL

    on "hello" do
      res.write "Hello world!"
    end

which again calls on... yup, it's one of those Inception-style things, but bear with me.

Ignoring try again, we are asked if the args ("hello") match.

So back in match we see

when String then consume(........

To keep things short, consume basically tests whether "hello" matches the path from the request and, if it does, tries to extract any params and store them in @captures (I'm not sure if captures is for params, just a hunch).
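Here is a rough sketch of what a consume-style method could look like: it matches a pattern against the front of PATH_INFO, moves the matched piece over to SCRIPT_NAME, and collects any captured groups. The name toy_consume, the exact regexp and the true/false return value are assumptions for illustration and are not copied from Cuba.

```ruby
# Rough approximation of the idea behind consume (not Cuba's own code).
def toy_consume(env, captures, pattern)
  matchdata = env["PATH_INFO"].match(/\A\/(#{pattern})(\/|\z)/)
  return false unless matchdata

  matched, *vars = matchdata.captures
  trailing = vars.pop   # the "/" or "" that ended the matched segment
  env["SCRIPT_NAME"] += "/#{matched}"
  env["PATH_INFO"] = "#{trailing}#{matchdata.post_match}"
  captures.push(*vars)  # whatever the pattern's own groups captured
  true
end

DEMO_ENV = { "SCRIPT_NAME" => "", "PATH_INFO" => "/hello" }
DEMO_CAPTURES = []
DEMO_MATCHED = toy_consume(DEMO_ENV, DEMO_CAPTURES, "hello")
# DEMO_ENV is now { "SCRIPT_NAME" => "/hello", "PATH_INFO" => "" }
```

With a pattern that contains a group, such as "users/([^\\/]+)" against "/users/1/posts", the captures array ends up holding "1" and PATH_INFO is left as "/posts".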

The result of match is true, so again return isn't executed and we go on to

yield(*captures), where finally

res.write "Hello world!" is called, storing "Hello world!" in @res.

Unlike previous iterations, the final block doesn't call the method on.

      yield(*captures) # we just finished here

      halt res.finish

So we finally call halt res.finish

where res.finish returns

def finish
      [@status, @headers, @body]
end

and is passed to halt as a response

def halt(response)
    throw :halt, response
end

And here we finally get to meet catch's previously mentioned counterpart, throw, and its payload, the response.

We are immediately taken out of this Inception headbanger, back to the catch in the method call!(env)

def call!(env)

    # This `catch` statement will either receive a
    # rack response tuple via a `halt`, or will
    # fall back to issuing a 404.

    catch(:halt) do
      instance_eval(&@blk)

      res.status = 404
      res.finish
    end
end

catch receives our throw, which was buried deep in the bowels of that instance_eval, and returns its payload without executing the code involving res.status = 404.

So call returns the response as Rack expects.

adamakhtar commented 12 years ago

In summary the DSL appears to act like a huge case statement.

It compares every 'on' expression with the request and if it doesn't match moves on to the next sibling 'on'. However, if it's true it descends down into the matched on via its block and continues to check if any of the nested 'on' expressions match the request. At some point we end up with a block of code intended to be the result.

The reason why we have the catch and throw system is to avoid unnecessary processing, I guess. Imagine if it was like this:

# two arguments here for brevity's sake, but perfectly valid in Cuba
on get, "hello" do
  do_something
end

on post, "blah" do
  do_something_else
end

on get, "blahblah" do
  do_something_else!
end

If the request was a GET to "/hello" then Cuba would have found the correct code to run on the first on. Without catch and throw, though, it would unnecessarily go on to check all the other routes declared.

I'm not sure if that's a massive performance penalty, but still, it's nice to be lean.

Maybe someone can confirm whether this is the primary reason or just a nice side effect.
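The "first match wins, siblings are never checked" behaviour can be shown with a toy router. This is not Cuba: MiniRouter, its on signature (which takes the path explicitly) and the checked log are all invented for the sketch.

```ruby
# Toy router, just enough to show on, throw and catch working together.
class MiniRouter
  attr_reader :checked

  def initialize(&routes)
    @routes = routes
    @checked = []
  end

  def call(path)
    catch(:halt) do
      instance_exec(path, &@routes)
      [404, "not found"]        # reached only if no route halted
    end
  end

  def on(expected, path)
    @checked << expected        # record every route that gets tested
    return unless expected == path
    throw :halt, [200, yield]   # first match wins; siblings never run
  end
end

ROUTER = MiniRouter.new do |path|
  on("/hello", path)    { "Hello world!" }
  on("/blah", path)     { "blah" }
  on("/blahblah", path) { "blahblah" }
end

ROUTER_RESPONSE = ROUTER.call("/hello")
```

After that one request ROUTER.checked contains only "/hello": the throw left the routes block before the second and third on were ever evaluated.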

Right that's me done.

Some things I'd like to know: what are @captures, and what is the reason for the #try method?

ericgj commented 12 years ago

wow, @robodisco, nice job.

re. #try, see the point cyx makes in #3. The way nested on calls work, at least for the regexp/string/symbol matchers, is to progressively 'eat' the URL by shifting the matching piece from PATH_INFO into SCRIPT_NAME, and leaving the remainder in PATH_INFO. That way, a route nested inside another matches on the remainder of the path, rather than the whole path. So effectively when you have something like

# GET /users/1/posts
on 'users/:id' do

  # Here, PATH_INFO = "/posts", rather than "/users/1/posts",
  # so that the following route matches properly
  on 'posts' do
      # ...
  end
end


But the downside to mutating the path like this, is that you would be left with inconsistent state after the nested route finished, if it didn't get reset afterwards. I can't think of a great example, but say you have some Rack middleware that comes into play after your app finishes, and needs to access PATH_INFO. Then you want that to be reset to the full path, not the last matching piece of the path. So this is the function of #try: by wrapping your route handler in #try, the path gets reset before finishing (halt).

  # @private Used internally by #on to ensure that SCRIPT_NAME and
  #          PATH_INFO are reset to their proper values.
  def try
    script, path = env["SCRIPT_NAME"], env["PATH_INFO"]

    yield

  ensure
    env["SCRIPT_NAME"], env["PATH_INFO"] = script, path
  end

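The save-and-restore idea can be tried outside Cuba. In this sketch with_path_reset is a made-up helper that takes env as an argument; the point is only that an ensure clause also runs when the block is left through a throw, which is how halt exits.

```ruby
# Hypothetical helper mirroring the save/restore idea of #try.
def with_path_reset(env)
  script, path = env["SCRIPT_NAME"], env["PATH_INFO"]
  yield
ensure
  # Runs on a normal return, on an exception and on a throw alike.
  env["SCRIPT_NAME"], env["PATH_INFO"] = script, path
end

TRY_ENV = { "SCRIPT_NAME" => "", "PATH_INFO" => "/users/1/posts" }

TRY_RESULT = catch(:halt) do
  with_path_reset(TRY_ENV) do
    # Simulate a matcher eating the first part of the path.
    TRY_ENV["SCRIPT_NAME"] = "/users/1"
    TRY_ENV["PATH_INFO"] = "/posts"
    throw :halt, :response
  end
end
# TRY_ENV["PATH_INFO"] is back to "/users/1/posts" at this point.
```

Without the ensure, the throw would skip the restoring assignment and the env would keep the shortened path.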
cyx commented 12 years ago

Hi Eric,

For your example, there should be no inconsistent state.

# GET /users/1/posts
on 'users/:id' do

  # Here, PATH_INFO = "/posts", rather than "/users/1/posts",
  # so that the following route matches properly
  on "posts" do
     # PATH_INFO=""
  end

  # PATH_INFO back to /posts
end

# PATH_INFO back to /users/1/posts

Thanks, cyx

ericgj commented 12 years ago

I think the Readme does a good job explaining captures, but they are basically very similar to Sinatra's - they are pieces of the URL that get matched and then passed into your route handler as parameters.

Except that Cuba has two kinds of captures that Sinatra doesn't have -

  1. on get, extension("css") do |basename| end will give you basename = "example" from the path GET /example.css
  2. on post, "foo", param("a"), param("b"), param("c") do |a,b,c| end will give you a, b, c = "1", "2", "3" from the path POST /foo?a=1&b=2&c=3 (or likewise if the params come from a form). Basically saving you the work of assigning local variables for each param within your handler.

(Note I haven't tested any of this, this is just going on what the Readme says and what it looks like in the code, the syntax may not be exactly right.)

ericgj commented 12 years ago

@cyx, exactly, what I meant was (but wasn't very clear): if you didn't have the try wrapper, you'd be left with inconsistent state. Thus the need for try, which robodisco was asking about.

So it seems like the only bits of state that get mutated by the framework (and reset appropriately for nested routes), are the SCRIPT_NAME and PATH_INFO, and @captures. Is that right?

adamakhtar commented 12 years ago

Thanks @cyx and @ericgj, that helps out a lot. There are a few more things I'm not sure of, but a re-walkthrough should clear things up.

One thing however: I see you mentioning the word 'state' a lot. Is this a Cuba 'thing' or a Rack 'thing'? I feel a bit stupid asking :-) but as I said in the beginning, no such thing as a stupid question. What other constants like SCRIPT_NAME are there relating to state?

cyx commented 12 years ago

Hi Eric,

Here is all the stuff that gets manipulated:

req - when you change SCRIPT_NAME and PATH_INFO, it technically changes.

res - when you write the response.

and yes @captures, but this is more internal state rather than something that the user should know.

@robodisco - it's more a Rack thing. More examples of state related to rack can be seen in middleware, which use env a lot, env["rack.session"] is the most common example, and last I checked the warden middleware uses env a lot too.
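As a small illustration of env carrying state between middleware and app, here is a sketch in plain Ruby. CurrentUser, GREETER and the "demo.user" key are made up for the example; real middleware would use its own keys, such as the rack.session one mentioned above.

```ruby
# Hypothetical middleware that stores a value in env for the app to read.
class CurrentUser
  def initialize(app)
    @app = app
  end

  def call(env)
    env["demo.user"] = "adam"   # made-up key, for illustration only
    @app.call(env)
  end
end

# The app further down the stack reads what the middleware left in env.
GREETER = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["hi #{env["demo.user"]}"]]
end

STATEFUL_APP = CurrentUser.new(GREETER)
```

The env hash is the only thing the two pieces share, which is why mutating entries like PATH_INFO counts as changing state.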

Thanks, cyx

Yes @captures, SCRIPT_NAME and PATH_INFO are the only ones manipulated.

adamakhtar commented 12 years ago

Just realised I never mentioned everyone in this issue. Better late than never so....

oi @codereading/readers just done a walkthrough - come and check it out!

adamakhtar commented 12 years ago

thanks @cyx!

cyx commented 12 years ago

Hi @robodisco,

Good eye regarding the throw as a performance improvement. It's not much, but it was something like a 2-3ms improvement, depending on the number of on statements you have in your app (we tried with around 10 that time).

We used to do it differently and changed it somewhere along 2.x. Here's the sketch commit that I made 10 months ago: fe467d233b2cefdcec862f4adbf44a281605754f

Thanks, cyx

cyx commented 12 years ago

Aside from the performance improvement, it was also a refactoring, since it used to be that run depended on a throw and the normal flow didn't. We kind of killed two birds with one stone by making throw :halt, tuple the de facto way to tell Cuba, "ok we're done, here's the response". Overall we're still happy with it.