codereading / rack

a modular Ruby webserver interface
http://rack.rubyforge.org/

Once again, where to start? #6

Open jessepollak opened 12 years ago

jessepollak commented 12 years ago

There are so many different source files in the lib/rack/ folder and I feel like a lot of them are interconnected. Does anyone have a good strategy for where to start and how to move through them?

Thanks!

ericgj commented 12 years ago

One place to start is from the place you'd typically interact with Rack in a web application -- which is to say, the config.ru rackup file. How does that work? What code runs that file, where in Rack is the DSL for use and run implemented, and how does it work to hook up requests from the web server to your application?

samnang commented 12 years ago

I have started a thread in the Google group about this topic and shared my thoughts there:

https://groups.google.com/forum/#!msg/codereading/40PYinWmMa0/a-1P--HHZcIJ

I'm not sure whether we should keep the discussion here or in the Google group.

skade commented 12 years ago

Actually, not so many of them are connected. Because most Rack middlewares are single-file affairs but live under the Rack namespace (and not Rack::Middleware), a lot of them can be found in this folder. These are loosely coupled at best; most of them are completely independent.

I would start with: Handler, Server, Builder and Utils

codereading commented 12 years ago

Samnang, thanks for posting that topic. It answers a question that I'm sure many people here in the group are wondering about.

Perhaps copying it over here would give it a better home, as it is Rack specific.

But I think we do need a topic in the main Google group for a general "How do I start?".

How about just editing your google group post a bit such as the title to "How do I start?"

I really wish we could have all our conversation in one place with the ability to group conversations around projects and have the features github provides. It feels a bit disjointed but I guess for now we will just have to put up with it.

Thanks again for the great post.

codereading commented 12 years ago

Hi Jesse,

I'll try to explain how I start. I'm writing this for everyone, though, so forgive my dummy-style prose; it might come in handy for someone down the line.

Regarding where to start I usually create a very very simple example app that utilizes the gem's core feature and ignores the fancy parts. Using that, I'll look for an entry point and try to follow the code right to the end. That's usually enough for me to orientate myself in the code without getting lost in features. And after that I can start to explore some of the more interesting use cases by expanding the example code.

With Rack, its core feature is letting you return a response to a simple request. The fancy part would be its ability to stack middleware together.

I guess the simplest of apps to use as an example for code reading would be a hello world app, an app which given a request to "localhost:3000" simply displays hello world.

In Rack you could achieve this by passing it a proc such as this:

my_rack_proc = lambda {|env| [200, {}, ["hello world"]] }

That's an array containing a status of 200, an empty header hash (you could put "content-type" => "text/plain" etc. here) and the content, hello world.
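A quick way to convince yourself of that shape is to call the proc by hand, with no server involved. This is only a sketch: the bare hash below stands in for the env (a real server passes a much fuller one), and HELLO_APP is just a name picked for the example.

```ruby
# A Rack app is anything that responds to #call(env) and returns
# [status, headers, body]. The env hash used here is a stand-in,
# not a complete Rack environment.
HELLO_APP = lambda { |env| [200, { "content-type" => "text/plain" }, ["hello world"]] }

status, headers, body = HELLO_APP.call({ "PATH_INFO" => "/" })

puts status                       # the HTTP status code
puts headers["content-type"]      # the one header we set
body.each { |chunk| puts chunk }  # the body only has to respond to #each
```

Nothing in this snippet requires the rack gem, which is part of why the interface is easy to poke at in irb.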

That's the response, but how about actually creating the Rack app? There are a few ways you could create a simple hello world app, and I'll let everyone read up on that here: http://gallery.mailchimp.com/e49655551a5bb47498310c7de/files/RackIntro.pdf. But a common way Rack apps are built is to make use of rackup.

Rackup comes with Rack and reads a small DSL. Its job is to make it easy to configure Rack middleware together. It also understands your environment, so it can automatically pick and run a server for you, e.g. WEBrick. There's no middleware for hello world, but the latter reason is good enough to use this approach. And since there's little code to write and it's common practice, it's probably good for code reading purposes.

Rackup expects a config file named config.ru, and you simply pass your proc to run:

# config.ru
# require './my_app'  # only needed once your app lives in its own file
run lambda { |env| [200, {}, ["hello world"]] }

then in your console

rackup config.ru

Check your browser at the relevant port and bam, hello world.

So now for code reading. Looking at the code, I'd say the entry point would be the run command. I'd go looking in the source code for that. I'd then see what happens to the proc and how it ends up getting displayed back in the browser.

Of course, another entry point would be trying to figure out the first point at which Rack receives the request from the server, then tracing that through until you reach the run method above.

One way you could do that (besides just scouring the source code) would be to use pry or ruby-debug (for 1.9.3 I think you need its cousin, debugger). You could put a breakpoint at some high point in the source code, make the request in your browser, and execution would stop so you can step through and find your way.
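If you'd rather not install a debugger, Ruby's built-in TracePoint (Ruby 2.0 and later) can list which methods get called. Note that this is a different tool from pry or debugger, and the two classes below are made-up stand-ins rather than Rack code, so treat it as a sketch of the idea only.

```ruby
# Hypothetical stand-ins for "a layer inside the library" and "your app".
class InnerApp
  def call(env)
    [200, {}, ["hello world"]]
  end
end

class OuterLayer
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  end
end

# Record every Ruby-level method call made while the given block runs.
def trace_calls
  seen = []
  tracer = TracePoint.new(:call) { |tp| seen << "#{tp.defined_class}##{tp.method_id}" }
  tracer.enable { yield }
  seen
end

traced_app = OuterLayer.new(InnerApp.new)
puts trace_calls { traced_app.call({}) }
```

Pointed at a real Rack app, the same helper would print a much longer list, which is a rough map of where to start reading.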

One common need for a Rack app would be the ability to return different content depending on the request URL. If you look at this very simple app demonstrating this, https://github.com/rack/rack/wiki/Rack-app-with-uri-and-HTTP-specific-responses, you'll see Rack provides some helper methods to query the request. That might be worth looking at. It might not be the most interesting code reading, but you'll find helpers in there you could use in your own apps.
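Underneath those helpers, the request details live in the env hash. As a rough sketch that doesn't need the rack gem, here is an app that branches on REQUEST_METHOD and PATH_INFO, two keys defined by the Rack spec. The routes and the ROUTED_APP name are invented for illustration.

```ruby
# Pick a response by looking at the HTTP method and path in env.
ROUTED_APP = lambda do |env|
  case [env["REQUEST_METHOD"], env["PATH_INFO"]]
  when ["GET", "/"]
    [200, { "content-type" => "text/plain" }, ["home"]]
  when ["GET", "/about"]
    [200, { "content-type" => "text/plain" }, ["about"]]
  else
    [404, { "content-type" => "text/plain" }, ["not found"]]
  end
end

status, _headers, body = ROUTED_APP.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/about")
puts "#{status} #{body.join}"
```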

Lastly comes the juicy part. I haven't even got to here yet, but Rack is famous for the way it allows middleware to be stacked. I think it utilizes the decorator pattern to do this. This is actually one of the things I'm most interested in studying. I'd create some middleware of my own, configure it in config.ru and, again using pry or debugger, figure out how it all comes together.

Hope this is useful. If you're still stumped feel free to ask more questions!

samnang commented 12 years ago

@codereading I cloned my topic from the Google group.

So here is what I started with:

How about just editing your google group post a bit such as the title to "How do I start?"

That's a good idea, but I think we should create a new post for that topic, so it's not tightly coupled to the featured project.

agis commented 12 years ago

Very helpful info, however I find myself stuck at this point:

def run(app)
  @run = app
end

That's the entry point you were talking about, but I don't know how to proceed after this. I understand what this method does (it assigns the app object to the instance variable @run), but how would I proceed from here? Where is this @run variable used? Why is it used like this?

ericgj commented 12 years ago

@Agis, it's a little tricky to follow since rack gives you various entry points into building an app. Did you follow the path from bin/rackup? You'll see that the place that the @run and @use are used is in #to_app (which is called when evaluating the config.ru file). That method is the key to understanding the whole pipeline - in particular https://github.com/rack/rack/blob/master/lib/rack/builder.rb#L130,

@use.reverse.inject(app) { |a,e| e[a] }

Keeping in mind that for simple cases, this can be reduced to

@use.reverse.inject(@run) { |a,e| e[a] }

It takes a while to puzzle out how this line works to create the Rack pipeline, but IMO it's fundamental to understanding the rest of the codebase and how middleware works.
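To look at that line in isolation, here is a stripped-down sketch of a builder. MiniBuilder and Shout are invented names and this is not Rack's actual source; it only mimics the idea that use stores a proc and to_app folds those procs around the app passed to run.

```ruby
class MiniBuilder
  def initialize
    @use = []
    @run = nil
  end

  # Store a proc that, given an app, wraps it in the middleware.
  def use(middleware, *args)
    @use << proc { |app| middleware.new(app, *args) }
  end

  def run(app)
    @run = app
  end

  # Wrap @run in each middleware, starting with the one listed last.
  def to_app
    @use.reverse.inject(@run) { |a, e| e[a] }
  end
end

# Hypothetical middleware that upcases every chunk of the body.
class Shout
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map(&:upcase)]
  end
end

builder = MiniBuilder.new
builder.use Shout
builder.run lambda { |env| [200, {}, ["hello world"]] }

p builder.to_app.call({})
```

The value that to_app returns is itself just another object with #call, which is why a stack of middleware can be handed to a server as if it were a single app.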

agis commented 12 years ago

Hm I see. If I'm getting it right, @use is an array that contains all the middlewares in the stack, that are going to be used.

But I'm not sure about what the inject is doing. I think it's inserting the app variable passed in #run, into the array and then accesses it using e[a]?

It's confusing :P

skade commented 12 years ago

@use is an array of procs that each take an app and build another app out of it by prepending a middleware (a rack app with a middleware in front of it is a rack app). This proc is called on each iteration of the inject and returns an app.

foo[a] is an alias for foo.call(a) if foo is a proc.

https://github.com/rack/rack/blob/master/lib/rack/builder.rb#L77-83
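A tiny sketch of that alias, plus the same inject written with .call so the brackets stop looking like array access. The wrapper procs here only tag a string, so the nesting order is easy to see; WRAPPERS and build are names made up for the example.

```ruby
double = proc { |n| n * 2 }
p double[21] == double.call(21) # the two spellings do the same thing

# Each proc wraps whatever it is given; plain strings stand in for apps.
WRAPPERS = [
  proc { |inner| "A(#{inner})" },
  proc { |inner| "B(#{inner})" },
  proc { |inner| "C(#{inner})" }
]

# Same shape as the builder line, with e[a] spelled as wrapper.call(wrapped).
def build(wrappers, app)
  wrappers.reverse.inject(app) { |wrapped, wrapper| wrapper.call(wrapped) }
end

puts build(WRAPPERS, "app")
```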

adamakhtar commented 12 years ago

Is this the general way rackbuilder works then? Those few lines with inject and use seem to play a massive role.

Say we define three middlewares A, B and C like so. For brevity I'll only define A; B and C are exactly the same except that they customize the response to name themselves.

class A
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # The body is an array of strings here, not an object with #body.
    # The middleware message will differ accordingly for B and C.
    [status, headers, ["<!-- Middleware A -->\n" + body.join]]
  end
end

and define our rackbuilder like so

use A
use B
use C
run RailsApp

Then, when a request for "posts/index" comes from the client, Rack will eventually call rackbuilder#call. This triggers a sequence of methods such as #to_app and #use. Working backwards:

rackbuilder#use will create an array of procs. Each proc is responsible for instantiating one of the middlewares listed above in the config.ru file, i.e.

[ proc { |app_param| A.new(app_param) }, proc { |app_param| B.new(app_param) }, proc { |app_param| C.new(app_param) } ]

again working backwards, #use is used here in #to_app

@use.reverse.inject(app) { |a,e| e[a] }

On the first iteration, app will be the main app, i.e. the Rails app. Looking at the procs in the array above, the middleware class C is then instantiated with the Rails app. Due to inject's accumulating effect, the second iteration results in B being initialized with the instance of C as its app, and so on, i.e. something equivalent to this:

1st iteration c = C.new(railsapp)
2nd iteration b = B.new(c)
3rd iteration a = A.new(b)

This results in a chain of middlewares getting instantiated. Each middleware has an attribute @app that is set to an instance of the middleware below it in the list of use statements in config.ru (and C's @app is the Rails app itself). The first middleware listed, A, is the final result of the inject call and is returned to the calling method.

This calling method is rackbuilder#call https://github.com/rack/rack/blob/master/lib/rack/builder.rb#L133

which immediately calls A's own call method.

Assuming the request is for a posts/index action A would immediately call @app.call(env). In this case A's @app holds an instance of B. And B.call results in C.call which results in calling the rails app.

The Rails app returns [200, {}, ["<h1>All Posts</h1>"]]

We return to C's call method, where it customizes the returned response with its own message: [200, {}, ["<!-- Middleware C -->\n<h1>All Posts</h1>"]]

B and A do the same, resulting in a final response of [200, {}, ["<!-- Middleware A -->\n<!-- Middleware B -->\n<!-- Middleware C -->\n<h1>All Posts</h1>"]]
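The walkthrough can be checked by hand-wiring the chain without a builder. This is only a sketch: it assumes array bodies, uses a hypothetical Tagger class in place of three near-identical A, B and C classes, and lets a lambda stand in for the Rails app.

```ruby
# One class plays the role of A, B and C; the name decides the marker.
class Tagger
  def initialize(app, name)
    @app = app
    @name = name
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, ["<!-- Middleware #{@name} -->\n" + body.join]]
  end
end

# Stand-in for the Rails app at the bottom of the stack.
FAKE_RAILS_APP = lambda { |env| [200, {}, ["<h1>All Posts</h1>"]] }

def build_chain
  c = Tagger.new(FAKE_RAILS_APP, "C") # 1st iteration of the inject
  b = Tagger.new(c, "B")              # 2nd iteration
  Tagger.new(b, "A")                  # 3rd iteration: the outermost layer
end

_status, _headers, body = build_chain.call({ "PATH_INFO" => "/posts/index" })
puts body.join
```

The markers come out in the order A, B, C because each layer adds its line only after the layer beneath it has returned.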

Is that about right?

agis commented 12 years ago

The logic is kinda complex isn't it?

So the order in which the middlewares are declared through #use matters. But are they run after the main app (the one that is passed to #run)?

For example, we have a Rails app and I'm using OmniAuth which is another Rack middleware. So if we were going to implement this in a config.ru, it would look like this:

use OmniAuth  # 1st middleware (also a rack app)
run RailsApp  # our main rack app

So the RailsApp would be instantiated first and then the OmniAuth middleware would follow, providing its functionality. And if we were going to add another middleware like this:

use OmniAuth  # 1st middleware (also a rack app)
use Middleware # our new middleware
run RailsApp  # our main rack app

then OmniAuth would be instantiated after the new middleware.

Is that right?

agis commented 12 years ago

I've done some more research and I now believe that the middlewares are run first and the app is the last that runs.

This is based on this talk: http://confreaks.com/videos/49-mwrc2009-in-a-world-of-middleware-who-needs-monolithic-applications and on running rake middleware to inspect the stack in one of my Rails apps.

skade commented 12 years ago

Middlewares can run before and after an application. It all depends on where your middleware does its work:

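class MyMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    do_stuff_before(env)
    status, headers, body = @app.call(env) # call the next middleware (or the app)
    do_stuff_after(status, headers, body, env)
    [status, headers, body] # every #call has to return the response triple
  end
end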

Imagine the Middleware stack as a linked list: each middleware knows the next one, but nothing else. Control is passed by calling the next middleware. At the end, control is passed back when the call stack is unwound. As long as every call to #call returns (status, headers, body), the middleware is free to do anything it wants with the environment.
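A small sketch makes the unwinding visible. The Noisy class and the LOG array are made up for the example; each layer records one line before handing control on and one line after it gets control back.

```ruby
LOG = []

class Noisy
  def initialize(app, name)
    @app = app
    @name = name
  end

  def call(env)
    LOG << "#{@name} before"
    response = @app.call(env) # control moves down the stack here
    LOG << "#{@name} after"   # and comes back here as the calls return
    response
  end
end

CORE_APP = lambda do |env|
  LOG << "app"
  [200, {}, ["ok"]]
end

STACK = Noisy.new(Noisy.new(CORE_APP, "inner"), "outer")
STACK.call({})
puts LOG
```

The "before" lines appear outermost first and the "after" lines innermost first, which is the linked list being walked down and the call stack being unwound.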

There are some subtleties to all this. First of all, env is not copied on call unless you explicitly do so. Manipulations to env that happen during @app.call are still there after it returns and will affect everything you do after that. This can be exploited to write information into the environment. The cookie-session middleware is an example: it sets a special key on call, reads the state of that key on the way back, and sets a cookie to save that state.
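Here is a sketch of the shared env. The key names "demo.before" and "demo.visited", the header name and both pieces of code are invented; the only point is that every layer sees the same hash object.

```ruby
class Stamp
  def initialize(app)
    @app = app
  end

  def call(env)
    env["demo.before"] = true # visible to everything further down the stack
    status, headers, body = @app.call(env)
    # Whatever the app wrote into env is visible here on the way back.
    headers = headers.merge("x-demo-visited" => env["demo.visited"].to_s)
    [status, headers, body]
  end
end

ENV_APP = lambda do |env|
  env["demo.visited"] = env["demo.before"] ? "yes" : "no"
  [200, {}, ["ok"]]
end

env = {}
_status, headers, _body = Stamp.new(ENV_APP).call(env)
p headers
p env.keys
```

If a layer needs a private copy it has to make one itself, for example with env.dup, and then changes made further down are no longer visible to it.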

The second one is that middlewares and the application are singletons: you #run one instance of the application and each #use constructs one instance of the middleware. This is done at bootup, not for every request. This is very efficient, but some frameworks (e.g. Sinatra) save per-request state in their application object, which has to be thrown away afterwards. This is easily dodged, for example by using the #call/#call! pattern used in Sinatra: Sinatra uses the application class itself as the app object, which immediately constructs an instance of itself on #call. It's similar to this code (plus some Sinatra specialities):

class MyApp
  def self.call(env) # class method!
    new.call!(env) # fresh instance per request, so instance state is not shared
  end

  def call!(env) # use call! to announce that this should only be called once
    # do_your_work
  end
end

For middlewares, which get constructed at stack build time to ensure that @app is known, you can use clone for a similar effect. This should be rare, as middlewares should be able to get away with storing everything in local variables. In a middleware, try to keep your hands away from writing to instance variables unless you know exactly what you are doing:

class Middleware
  def initialize(app)
    @app = app
  end

  def call(env)
    clone.call!(env) # work on a per-request copy of this middleware
  end

  def call!(env) # be aware that env is still not copied
    @dirty = something(env)
  end
end
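To see why instance variables are risky, compare a middleware that writes to one directly with one that clones itself per request. Both classes are hypothetical and the "state" is just a counter, so this is a sketch of the effect rather than a recommended design.

```ruby
class Leaky
  attr_reader :count

  def initialize(app)
    @app = app
    @count = 0
  end

  def call(env)
    @count += 1 # shared by every request that hits this one instance
    @app.call(env)
  end
end

class Isolated
  attr_reader :count

  def initialize(app)
    @app = app
    @count = 0
  end

  def call(env)
    clone.call!(env) # each request works on its own copy
  end

  def call!(env)
    @count += 1 # only the copy is modified, the original stays untouched
    @app.call(env)
  end
end

NOOP_APP = lambda { |env| [200, {}, []] }

leaky = Leaky.new(NOOP_APP)
isolated = Isolated.new(NOOP_APP)
3.times do
  leaky.call({})
  isolated.call({})
end
puts leaky.count
puts isolated.count
```

Note that clone is shallow: reassigning an instance variable on the copy is safe, but mutating an object that both copies point to (a shared hash, say) would still leak.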

Hope that clears things up a little.

Regards, Florian

agis commented 12 years ago

I'm feeling grateful that you're around.

Thank you.