tower-archive / tower

UNMAINTAINED - Small components for building apps, manipulating data, and automating a distributed infrastructure.
http://tower.github.io
MIT License
1.79k stars 120 forks source link

Refactoring / Improving Networking, Models, etc.. (Centralization) #325

Closed thehydroimpulse closed 10 years ago

thehydroimpulse commented 12 years ago

Right now the Tower.Net and the whole networking infrastructure seems a little weird... It would be awesome to be able to have a Tower.NetManager that would be available within Tower.Application.instance().

My idea is to have an array of connections within this class instance that would hold all the information about the user:

class Tower.NetManager extends Tower.Class
  connections: 
    id: 0 # Holds the socket id
    class_instances: [] # Holds all of the class instances for controllers, models, views, etc...
    socket: [] # Holds the normal socket object.
    auth:  
      user: # Current user object, f the user is logged in. Otherwise the auth object would be null
      role: # Current role object that is connected to the user. If the user has multiple roles this could be an array.
      permission: # Current permission object that would be extracted from the authorization module. 
      # This would hold each model and what actions the user has permission to do. e.g
      # permission: 
      #   0: 
      #     model: 
      #     actions_can: ["create", "update"] 
      #     actions_cant:  ["destroy"]

This would allow a really good centralization unit for instances, sockets, ajax, etc...


class Tower.Controller extends Tower.Class
  currentUser: Tower.Application.instance().NetManager.currentConnection.get.id()

Tower.NetManager would present a good DSL to retrieve and use the information provided. It would also be much simpler to access the information as it's extremely centralized.

# Inside some loaded file.coffee (client)
# If your on the client you don't have to use the *.currentConnection.* tag as there is only one connection (you).
# But on the server you would do the following:
currentSocketID = Tower.Application.instance().NetManager.currentConnection.get.id()

# To get the instance of a controller: (client version)
applicationControllerInstance = Tower.Application.instance().NetManager.get.instance("ApplicationController")

Again the DSL would need to be thought out a lot more. But this would allow to retrieve this information from anywhere and would allow building on this extremely easy. You could also provide interfaces to send or emit socket / ajax messages to specific clients (with a restriction system), all clients, etc... Maybe a DSL close to what models have but with less options obviously.

Because at the moment, the current way of manipulating sockets, clients and class instances is a little hacky / clunky.

Thought?

(Maybe i'm going in the wrong direction here and something like this already exists? I just started on actually implementing this... so well see.)

lancejpollard commented 12 years ago

I like the idea of having a centralized interface to all things related to the user:

At the same time this interface - entry point to the "user" stuff - should be as as simple as possible. But, because the server has multiple users it has to keep track of (whereas the client only has one), we can't make it this simple:

App.user # or App.currentUser
App.socket
# ...

because the server would need to wrap that in what you're calling the Tower.NetManager object (currently Tower.NetConnection), so like this:

App.connection.user
App.connection.socket

With that we can also do neat things on the client like wrap those in Handlebars helpers, such as:

Handlebars.registerHelper 'currentUser', ->
  # since behind the scenes `App == Tower.Application.instance()`
  Tower.Application.instance().connection.user

Handlebars.registerHelper 'currentAbility', ->
  Tower.Application.instance().connection.ability

and then in the views do something like this:

{{if currentUser}}
<h1>Welcome {{currentUser.name}}</h1>
<a {{bindAttr logout href=true}}>Logout</a>
{{else}}
<a {{bindAttr login href=true}}>Login</a>
{{/if}}

In terms of the class_instances idea, I like this kind of thing and have been thinking a lot about it too, but it seems to be pretty complex to implement and I'm not sure how much benefit we will get out of it (mainly in terms of performance). For the client it makes sense, it's pretty much what we're doing now. But for the server it gets more complicated. Here are some of the issues on the server.

First, for the controllers, we'd have to refactor all the actions to be passed parameters instead of setting request, response, params, headers, etc. to the controller instance, so actions would go from this (current):

# on the server
class App.PostsController extends App.Application
  index: ->
    App.Post.all (error, posts) =>
      @respondWith (format) =>
        format.html => @render 'index', collection: posts
        format.json => @render json: posts, status: 200

to something like this:

# on the server
class App.PostsController extends App.Application
  index: (request, response, next) ->
    App.Post.all (error, posts) =>
      @respondWith (format) =>
        format.html => response.render 'index', collection: posts, next
        format.json => response.render json: @collection, status: 200, next

The reason is because if we're going to cache the controller object, then what if the current user sends two async requests, and the second one finishes before the first... If they both go to the same instance and set variables, the state would get messed up. So there can't be any state kept in server-side controllers; everything would need to be passed as arguments like app.use (req, res, next) ->. By implementing that pattern, we stick more closely to express' API, and it is cleaner because we can now cache the controller instances. But at the same time, the API becomes more verbose (we have to manually manage a bunch of common/boilerplate code), and the controller actions start revolving around the request and response objects, rather than the controller itself. This would mean - if we want to keep the API on the client and server pretty consistent - we need to introduce the request object on the client, which isn't really necessary (at least it doesn't seem necessary at the moment).

So then it comes down to, what is the benefit we gain by caching the controller instances on the server per-user? Pretty much, it comes down to performance - we could potentially shave off some milliseconds from the request by not creating a controller instance for each request. But that's all speculation, I haven't tested that. And I'm starting to think that it wouldn't be large enough to sacrifice the simplicity of the current controller action API.

That's my main concern in terms of caching controller instances on a per-user basis.

In terms of models, having a central object for each user/socket/session would not only standardize the way all-things-user get accessed (through some net manager/connection object), but it would also allow us to create an identity map. We could scope all the data for a request to the current "connection", so associations/reflections wouldn't have to reload data that's already loaded. I think that's pretty cool. And we can do that with the current or express-style controller action api (above), so we could leave it the same and still accomplish this.

Those are my main concerns/thoughts for the server.

For the client, we should/can definitely cache everything on a central object, and maybe alias it to App, so:

App.currentUser == App.connection.user

One issue with this is Ember already has ideas on how this stuff should work, so we have to work with/around that. For example, they lazily-instantiate controller instances on the App.router object, and we can't change that (at least it seems). So Tower just aliases App.postsController to App.router.postsController essentially. If we made this work around a "connection" object, then we'd have to take that kind of stuff into account.

In terms of caching Ember.View instances on the client, I like that idea but there are lots of things to consider with that. The benefit of doing this is that if you append a view to the DOM a second time, it doesn't have to rebuild it's own DOM, since it was done the first time. This makes rendering noticeably faster [on some browsers, not Firefox or IE it seems though, where it doesn't even matter]. The main issue with this is that all the bindings and such that Ember.View has created stay in memory, so we have to either have the developer manually manage memory, or do some fanciness in Tower, which could get complex. I tried going down that road once but it seemed it would take a lot of time and not produce that much fruit. The cost of manually "disabling" Ember bindings and such on "non-visible" (cached) views quickly becomes just as expensive as just destroying and re-creating the view.

The one situation where it would be useful is if the view doesn't contain much dynamic data (such as a page-specific menu that might get added/removed when the URL changes, but doesn't change it's content). In that situation it may be beneficial to cache the views.

Overall, I agree with you that we should refactor the Tower.NetConnection interface so that it's a cleaner interface to all things user. Seems like you have a solid idea of what should happen and it only really comes down to the specific API. Let me know what you're thinking next, this is definitely a powerful component that will be very useful when simplified.

thehydroimpulse commented 12 years ago

Sounds good, I like where this is headed.

Server Controller Caching: I thought of this as well and it may be simpler not to have it honestly. Dramatically changing the api and constantly adding boilerplate code might not be the best of options but we'll see. Maybe down the road it could be a possibility.

View Caching: It seems like the negatives out weigh the positives in terms of caching. The data-bindings atm seem like a solid one.

So I guess it pretty much comes down to Networking (Sockets, Ajax), Authorization, Authentication, and client-side controller caching and model "caching" (not really caching but still).

Because the client will only have one connection, it [api] may be different in some places. Though in the controllers, models, and views it will all be the same which will make things extremely easy. This will also simplify background jobs, uploads, and other modules that could use this functionality.

Having everything inside the App object is a lot cleaner. Things like App.connection.user or App.connections.user could access the global repository or directory of all connected users. This would possibly increase speed when working with authentication and authorization by caching it between requests.

So I guess the next step is planning the implementation of such centralized system.

  1. Refactor Tower.NetConnection and Tower.connections to be used in a centralized manner
  2. Create the interfaces for storing, accessing, and managing all the resources. Should it be Tower.Manager or something?
  3. Make it easy for modules to plug into this interface. For example the authentication, and authorization modules. (But these will most likely be manually implemented at this stage)

We can start with the sockets and add more things as it comes. But this would greatly change some core Tower stuff. Should this become another branch to build and make it work? Once it's stable and fully works in all conditions it could be merged into master?

This could add some really awesome things into Tower imo and could get better.

edubkendo commented 12 years ago

I like some of where this is going. I definitely think we should keep the controller's how they are. The change seems to add a lot of boilerplate and complexity, without paying off much. Same with the views. Trying to deviate here, and basically undoing a lot of the work ember already does for us, doesn't seem to pay off nearly well enough.

Cleaning up the network stuff, though, sounds like major win to me. Having recently trawled through this repeatedly, I can definitely agree that simplifying this stuff would be really good. Having a single connection object to work with on each client will be really helpful, and makes implementing some security features much easier. After working with the sync-between-browser stuff recently, I know I had concerns and others voiced them as well, about making sure information you don't want populated to every client can be filtered in some way. This would go a long way towards making that do-able.

As far as how to go about working on it: of course it makes sense to work on a topic branch while you work, that way if you want to work on something else you can check out a branch from a clean master to work from that doesn't have the other, unfinished changes in it. But as far as when to merge it in, I'd say we should make a goal of getting a fairly stable API for common uses and things settled and in place as soon as possible, so models, views, controllers, routers, sockets, auth/auth. Then yeah, other more experimental ideas can be worked on more seperately. You mention greatly changing some core tower stuff, where do you mean? It seems like we all pretty much agree that the things which would cause sweeping API changes were probably more trouble than they were worth?

One thing I've been wondering about recently, are there places where tower's internals, or even API, could benefit from using more streams. Mostly because I've been reading a lot of @substack stuff lately, like this: https://github.com/substack/stream-handbook . Way off topic, I know, but just been on my mind and wanted to toss it out there since we're talking about changing stuff. Sorry, random.

thehydroimpulse commented 12 years ago

@edubkendo By Tower core I meant networking (sockets, ajax, etc..).. I kinda count that as being core. But yes I do agree with leaving the controllers and views alone. It won't bring too many benefits right now.

This would bring a huge win imo to Tower.

lancejpollard commented 12 years ago

@edubkendo for sure, we're going to do as much as possible with streams, they are great. Just created an issue so we can discuss it further b/c there's some cool stuff.

lancejpollard commented 12 years ago

Sweet, so it sounds like we're all on the same page. No view caching, and we can worry about controller caching optimizations in the future if it becomes clear there's a need.

@TheHydroImpulse do what you think is best in terms of how you want to approach the problem. I personally like to map out the entire api and [sort of] document it, and then get it working from there (vs. trying to build the api for just one chunk, then when that's done, the next chunk, etc.). It makes it clear the big API changes that we're making, so we can easily discuss/modify/agree on them.

For that Tower.Manager (this is the all-things-user object), I liked the name connection, but it's not perfect. I'm not really a fan of "Manager" in code (Flex's fault). Manager's make MVCs feel too bulky and/or complex in my opinion, and I think it's because of their history coming from [it seems to me] Java Design Patterns. I was thinking Tower.Client but that's confusing, any term related to user is confusing (user, userAgent, agent, client, etc.), even though that's kind of what we're doing. I'm sure we'll think of something. If you really want to change it feel free, but we'll probably come back to it. A lot of the naming has to be cleaned up too (the ugliness of Tower.StoreMongodb and all that is b/c I "flattened" nested namespaces (Tower.Store.Mongodb). Same with Tower.Net.Connection.

One last thing. I recommend implementing in as small chunks as possible. I made the mistake of creating a huge development branch to merge in Ember and it took many many months to complete and nobody really knew what branch to use (and it became too much work to manage both master and development). I see this happen in a lot of projects, it's easy to take 6 months to a year to merge in such a branch.

So instead we should strive [it's an ideal, so it's often not practical, but it's still to be strived for] to spend a few days to a week or two and merge the stuff into master. If you make big API changes those can usually be done in a few days since they don't usually change functionality, so we can push those to master, then work on adding some specific thing, merge that in etc.

Although [counter point], I personally prefer to work on many different things at once (so maybe in this case it would be the API, the emit/broadcast stuff, the jobs/queue code in the model, the controller hooks on model jobs, emitting controller hooks to sockets, etc.). The tendency might be to put those into one branch like net-refactoring and start at it. But it gets overwhelming pretty quickly and often takes too long. Instead lets aim to make each one of those into branches, and merge into the main branch. So:

git branch net-refactoring
git branch emit-broadcast-stuff
git branch jobs-queue-stuff
...

And just merge those into master. If you find it useful to create an intermediate branch to merge all the "net related" stuff into before master (kind of like staging), then try it out, but I have found it slows down my progress the larger the changes are. It's easier to just have small feature branches that you can finish in a week or less [ideal, but still worth striving for].

Not sure if that's what you were asking lol, just some thoughts on branching :)

Really, all this stuff would be cool to have, so it doesn't matter to me what we do first. Let me know when you get started.

Excited!

thehydroimpulse commented 12 years ago

@viatropos Yeah we'll have to get the naming right, but we can always change it later. I'll work around Tower.Manager and see where it goes in terms of naming.

Thanks for the tips on branching n' stuff. I think implementing in smaller working chunks would be easiest.

I'll keep you posted on thoughts, ideas, etc... (I typically write everything in google docs first. Makes it 100% easier)

lancejpollard commented 12 years ago

Cool, sounds good.

lancejpollard commented 12 years ago

Messing around with this a little. One note, you can create a socket "client" from the terminal (or even from within the node server process), so on the "server" portion of Tower there needs to be both client and server "connection" objects.

Implementing basic support where if you modify data in the tower console process, it will sync to the browsers. I started with hook.io, but since that's deprecated I started with require('net') and raw TCP sockets, but then it sends streams of data and you have to create a JSON parser to handle the model syncing stuff. So that's solved by socket.io, and you can require('socket.io-client') in node and use it the same you would from the browser.

thehydroimpulse commented 12 years ago

So I've been thinking on how to best approach this. I messed around with a few things and here is where I got to.

The first goal is to bring a level of centralization to the mix. Knowing exactly where each instance is at all times in all environments (testing, in a controller, in a model, etc..). So basically this is essentially a service locator pattern. This has both pros and cons in it-self. It really becomes a registry type model. (every time I hear that term I think of goddamn Windows and it's bad design. lol)

For staters, this isn't bad. It's just very limited on what it can do. The next option would be dependent on if you like it or not as it's quite different and it seems like it's not very used in the node.js world, though other platforms heavily use it (more noticeably static typed language.)

Dependency Injection would be perfect to fit the bill. Having a DiC (Dependency Injection Container) that would control (in the single context as you could have many containers running at the same time) the dependency injection.

Imagine having a simple controller like the following:

class App.ApplicationController extends Tower.Controller 
  index: ->
    # Some Database call on the users model...
    @render 'index'

Let's say you want to test view rendering within a controller. This would create problems as your depending on the database statically here (kinda). This means your not actually testing the view rendering in your controller but also the dependency it has on the database. If the database fails then so does this test.

A nice solution would be to use a DiC and have some mock objects. You could then nullify the database component in your test code while leaving all your controllers or actual logic alone.

Here's an example on retrieving or resolving a component:

Container.resolve("Controller") # Returns the controller object / instance. 
Container.resolve("Controller").index() # Loads the index method and fails because of the database.
Container.mock 'Database', new (class Database
  find: ->
    null
)
Container.resolve("Controller").index() # Passes successfully because the database doesn't perform any actions.

You could extend this much further by offering a deep-nesting of dependencies. So you could have a chain of dependencies like

App -> Router -> Controller -> View -> Template -> Response ....

But in each and every file you wouldn't have any require calls. You wouldn't have any explicit class calls. You would just in turn go to the container to retrieve and resolve the dependencies.

Of course that syntax may not be "pretty" so you can have an alias like:

App.Controller # = Container.resolve("Controller")
App.Router # = Container.resolve("Router")

Though again this doesn't eliminate the need for a socket manager, but you could easily implement that as another dependency. This also allows for lazy-loading and file maps for all your dependencies (possibly even creating a graphical map outlining everything...mmmmm).

I just wrote a quick DiC / IoC library for this: https://github.com/TheHydroImpulse/Ettore and so far all the features mentioned here are all functional and working (though I still need to create some tests for them).

Next features I'll probably implement are:

Thoughts?

lancejpollard commented 12 years ago

Haven't quite fully wrapped my mind around what you're suggesting, though I get the gist of it.

Let me see if I'm following, basically, the goal is to have a clear set of objects and to manage them with what you're calling the Container. This would make it so you could easily swap out the objects (that are set to variables on Container). This makes things like testing easier because you can then easily swap out mock objects for actual objects (like your view rendering example, mock out the db so you're just testing the rendering, not the db too).

You'd also be able to set nested dependencies pretty easily (from what you're suggesting), such as:

# pseudocode
App.set('Router.Controller.View.Template.Response', new MockResponse)

I definitely agree that the API in Tower should be clean/clear enough that it's easy to mock out objects like this. It's not there yet but that's definitely the direction we're going.

I think Dependency Injection / Inversion of Control are very powerful and useful patterns. They help us define clear separations in our code, and make it easy to do things like mocking (or having swappable datastores for example).

However, I think Dependency Injection is something that should be hidden from the developer using the framework. There should be no docs saying "Dependency Injection". The reason is because, at least to me, it also makes the framework feel bulky and complicated, and brings with it that sense of Java-ness that just overcomplicates things.

I used Dependency Injection a lot when I used Flex (mate framework, swiz framework dependency injection, etc.). Those patterns were definitely the most robust ways to write your code, and it simplified a lot in the codebase. But, it also required a lot of mental overhead - it took a lot to learn and to use, even though it simplified the code you had to write a lot. Maybe that's just me but that's the way I felt about it. I would use it for sure, but I think most people would rather use something simpler.

That said, Tower should strive to have the public API be super simple and not require any [as little] knowledge of design patterns and such as possible. It shouldn't require the use of Dependency Injection. Everyone has to learn MVC/MV* if they want to become good at coding, but I don't think everyone needs to learn Dependency Injection. That kind of thing.

So if the public API is simple I'd be down to explore dependency injection as you're describing it some more, but I think we can accomplish the same thing your suggesting without using those formal design patterns per say.

Thoughts?

thehydroimpulse commented 12 years ago

Exactly! Most frameworks apart from ones built using static typed languages typically hide the DiC. And for some reason, dependency injection is a lot harder to wrap your mind around it. Maybe because it complicates things at first to make it easier in the end, but it is a hard pattern to learn.

This would definitely make it harder for people to get going with Tower if we mention dependency injection everywhere. We could however have an "advanced" page where users comfortable with the topic could learn more about how it works in Tower.

So technically it would seem like a service locator, acting like a bunch of setters and getters through a centralized object. This is fairly easy to understand and is pretty simple to get going with as it's pretty much everywhere. Users would also not need to know about the DiC (Dependency Injection Container at all or it's components. Users would just naturally use the front interface:

App.get("Tower.Controller")
# or
App.resolve("Tower.Controller")
# or
App.Tower.Controller
# or 
Tower.Controller 
# You could have a number of global objects outside of the app but this would just become an alias. But would make things a little easier.

In testing however users could have access to the "mocking" or "faking" functionality of a DiC without actually knowing much about it. Though we would have to explain it to them in a way that isn't strange or weird.

In the end this could be a very powerful pattern to work with, if it's done right. I'm gonna research some more about DiC and many techniques used by other frameworks.

Also this wouldn't be hard to change in Tower as it's a change as you go type of thing. Some things would need to be changed together of course, but overall you could easily incorporate the DiC without much trouble. This also has the ability to lazy-load modules. Though i'd have to look at the file watcher system a little more to really understand how things are uncached or "reloaded" so that it could be brought into or tied into the DiC.

lancejpollard commented 12 years ago

Makes sense, yeah I like where you're going with this! It'll be very powerful when done right.

thehydroimpulse commented 12 years ago

@viatropos So I've been improving the IoC library that I'm making to allow a bunch of useful features. https://github.com/TheHydroImpulse/Ettore

Right now it supports:

Alises This is a cool feature that let's you mimic an array and object. Meaning:

IoC.resolve("Tower.Controller")
# would be the same as writting:
Tower.Controller
# or
App.Tower.Controller
# or 
App.Controller

You'll be able to define how the aliasing works and the naming of it's objects. But in reality calling:

App.Controller

would effectively (in the background) run:

IoC.resolve(IoC.alias("Controller"))

So this could lead to pretty powerful stuff, and in the end, you don't actually touch the IoC up-front, as it's all behind the scenes.

Though I want it to be 100% tested before it goes into Tower or before I start implementing it, into Tower.

lancejpollard commented 12 years ago

Nice! Will dig into it further soon (just took a glance, saw aliases).

Check out Ember.alias too if you haven't, it might be useful.

thehydroimpulse commented 12 years ago

I'll that check that out, thanks.

thehydroimpulse commented 12 years ago

Ok so a little update on the progress....

I never knew how much power Javascript has with the new harmony proxies. Unfortunately not all browsers or javascript engines support this new feature. Node (Because V8 supports it) has it built in, but you need to add a run flag to enable it (kinda a pain). Luckily there are a few libraries out there for browsers and node (node-proxy) to make sure this feature is available. Mainly because Aliases in my library needs this feature.

So I've been working on the aliasing feature letting you do:

Tower.Controller.Action

which actually runs (in the background)

IoC.resolve("Tower.Controller.Action")

As of right now, this successfully works! (albeit a couple of bugs that I need to fix)

The api of creating an alias is very simple and it allows you to control the naming in the end.

So for example, let's say you have the following component:

IoC.set "Tower.Controller.Action", "Success!"

Maybe you don't want to use the full long Tower.Controller.Action every time to access it. Well, with aliases you can define the top-level namespace you want to start with:

Controller = IoC.alias "Tower.Controller"

You can now access the same component with:

Controller.Action # = "Success!"

This lets you have a lot more power over your objects and you can create a slew of namespaces. I believe before you had shortened Tower.Net.Connection to Tower.NetConnection because of javascript objects vs how ember works. I'll have to experiment to know for sure, but I'm thinking you might be able to use traditional naming conventions without shortening the name because they are all stored as strings and are all individual objects.

thehydroimpulse commented 12 years ago

So there are a few issues in this technique, but I'm seeing, if there are other solutions out there.

For starters, there is a limitation on the following issue. We need to find a way to find a difference in the following two calls:

Tower.Controller.Action
Tower.Controller.Action.Index

Because if the first returns a dependency, then we can't call other aliases on it. In this example, the '.Index'. As this system works on a recursive design always returning a proxy object. So we need to somehow, need to find a difference in the previous call, because if we are wanting the Action dependency, we return that, but if we need the Index dependency, then we need to return a proxy object on Action, to be able to call Index.

Kinda a weird problem, but for now we need to use a sorta promise method to return the dependency, if there are naming conflicts in part of the fully qualified naming.

Tower.Controller.Action.then ->
    @get # returns a valid dependency.
Tower.Controller.Action.Index # returns a valid dependency.
thehydroimpulse commented 12 years ago

Ok, I have a kind of buggy solution, but one that might work under certain naming conventions. The rule of naming currently is plural vs singular fashion. This is only one naming option available.

Singular - Getting Dependencies

If you want to access the dependency, you would use the singular version of the word. e.g

Tower.Controller # Returns dependency

Plural - Chaining

To continue going through the proxy chain to go deeper in naming to access another dependency then you use a plural version of each word. e.g

Tower.Controllers.Action # Returns a dependency, because "Action" is singular, but keeps the chaining going because "Controllers" is plural.

I guess for now this would work, until I can find a better alternative. But also keep in mind that the "actual" stored naming is not effected by the different of plural vs singular.

# The keys are the same as before with traditional naming conventions. Only calling these dependencies through 
# the aliasing feature will you have to use; plural vs singular methods.
IoC.set "Tower.Controller.Action", {default: "index"}