LouisJenkinsCS / WaterQuality


Design Model #1

Open LouisJenkinsCS opened 7 years ago

LouisJenkinsCS commented 7 years ago

My proposed design model.

wr2f80m - imgur

LouisJenkinsCS commented 7 years ago

Soon I'll add some specification for the User object in general. In particular, there would need to be some additional stateful information, such as: is it in a connected state, is it awaiting a web page that requires data from the database, is it in the middle of a transaction, did it modify its account information, is it currently signed in, etc.

Stuff like that. From this information we can move forward from one state to another.

LouisJenkinsCS commented 7 years ago

image

New update for control flow diagram.

Just like before, the Client Browser submits an HTTP request, which of course is received by the ControlServlet. The ControlServlet, beyond basic processing, immediately creates an AsyncContext that will handle the rest of the processing so we may accept the next client. The RequestProcessor will handle all requests regarding the current client.
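The hand-off can be sketched with plain `java.util.concurrent` primitives; this is only a stdlib analog of the real flow (the actual ControlServlet would use the Servlet 3.0 `AsyncContext` API, and `accept`/`process` are illustrative names, not the real methods):

```java
import java.util.concurrent.*;

public class HandoffSketch {
    // Small worker pool standing in for the RequestProcessor's threads.
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    // Called on the "HTTP" accepting thread; returns immediately so that
    // thread can go accept the next client, mirroring ControlServlet + AsyncContext.
    static Future<String> accept(String request) {
        return workers.submit(() -> process(request));
    }

    // Runs later on a worker thread (the RequestProcessor's job).
    static String process(String request) {
        return "response for " + request;
    }

    public static void main(String[] args) throws Exception {
        Future<String> f = accept("GET /graph?param=temperature");
        System.out.println(f.get()); // response for GET /graph?param=temperature
        workers.shutdown();
    }
}
```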

It is important to note that while we may have increased the number of clients we can take on at once, this does not mean that we are scalable yet. The key is that everything must be broken down into short (and stateless) tasks. If we had one long-running task, such as having a single task handle 100% of a request, then we'd end up with the same problem we're trying to avoid. If we instead have each task handle, say, 10% each, then we can form a sort of pipeline.

Example: Let's say that the user wants a graph of certain parameters (say, temperature) between 10AM and 12PM. With the current model, this is what will happen...

  1. Client sends HTTP request...
  2. ControlServlet receives HTTP request on HTTP Thread A and forwards asynchronously to RequestProcessor
  3. RequestProcessor needs to access data first, so it tells worker thread B that it needs Data between 10AM and 12PM
  4. B starts in DataModule and asks DataManager for data between 10AM and 12PM
  5. B is in DataManager, looking to see if it has cached that data. If not, it asks DatabaseManager for all data between 10AM and 12PM. At this time, A is servicing a new client connection
  6. B gets the data needed and caches that data for later, and returns it up through DataModule which returns it back up to RequestProcessor. At this time, B received another request for graph data from A.
  7. A obtains data and forwards it to JSPModule which will be handled in worker thread C.
  8. C obtains data and defers work on generating the graph to another worker thread D.
  9. D finishes formatting graph, returning it to C in JSPModule
  10. C, now that it has the graph data, processes it and returns the JSP page in JSPGenerator
  11. A retrieves the JSP page, and forwards dispatching to HTTPDispatcher which sends the HTTP Response to client.
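The eleven steps above can be sketched as a staged pipeline with `CompletableFuture`, where each `*Async` call hands the next stage to whichever pool thread is free. The stage bodies here are placeholders (the module names mirror the diagram, but the method signatures are illustrative, not the actual APIs):

```java
import java.util.concurrent.*;

public class PipelineSketch {
    static final ExecutorService pool = Executors.newFixedThreadPool(4);

    static String fetchData(String range)  { return "data[" + range + "]"; }      // DataModule/DataManager
    static String buildGraph(String data)  { return "graph(" + data + ")"; }      // GraphGenerator
    static String renderPage(String graph) { return "<jsp>" + graph + "</jsp>"; } // JSPGenerator

    // Each stage is short and stateless; between stages the thread is freed
    // to pick up another client's work, forming the pipeline described above.
    static CompletableFuture<String> handle(String range) {
        return CompletableFuture
                .supplyAsync(() -> fetchData(range), pool)         // steps 3-6
                .thenApplyAsync(PipelineSketch::buildGraph, pool)  // steps 7-9
                .thenApplyAsync(PipelineSketch::renderPage, pool); // steps 10-11
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle("10AM-12PM").get()); // <jsp>graph(data[10AM-12PM])</jsp>
        pool.shutdown();
    }
}
```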

Now all of this is complicated, and it definitely is non-trivial... if we weren't using another library to handle this for us. I chose RxJava because I have experience with it, and it is actually designed to handle backpressure and to evenly distribute work among its worker threads. This is why I added it as a library to the project. It is another reason why all parts must be modular enough to return data, rather than rely on side-effect-induced behavior.
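RxJava itself isn't shown here, but the backpressure idea it provides (a consumer explicitly requesting only as much work as it can handle) can be demonstrated with the JDK 9 `java.util.concurrent.Flow` API as a stdlib stand-in:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.Flow.*;

public class BackpressureSketch {
    static final List<Integer> seen = Collections.synchronizedList(new ArrayList<>());

    public static void run() throws InterruptedException {
        seen.clear();
        CountDownLatch done = new CountDownLatch(1);
        SubmissionPublisher<Integer> pub = new SubmissionPublisher<>();
        pub.subscribe(new Subscriber<Integer>() {
            Subscription sub;
            public void onSubscribe(Subscription s) { sub = s; s.request(1); } // ask for ONE item
            public void onNext(Integer item) {
                seen.add(item);
                sub.request(1); // pull the next item only when ready: backpressure
            }
            public void onError(Throwable t) { done.countDown(); }
            public void onComplete() { done.countDown(); }
        });
        for (int i = 1; i <= 3; i++) pub.submit(i); // submit blocks if the subscriber falls behind
        pub.close(); // signals onComplete once all pending items are delivered
        done.await();
    }

    public static void main(String[] args) throws Exception {
        run();
        System.out.println(seen); // [1, 2, 3]
    }
}
```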

As well, External Data is data sent to the server, which the DatabaseManager handles.
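The caching the DataManager performs in the walkthrough above (check the cache before asking DatabaseManager, remember the result afterward) can be sketched with a `ConcurrentHashMap`; the method names and key type are illustrative, not the actual DataManager API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DataManagerSketch {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final AtomicInteger databaseHits = new AtomicInteger();

    // Stands in for DatabaseManager: expensive, so only touched on a cache miss.
    static String queryDatabase(String timeFrame) {
        databaseHits.incrementAndGet();
        return "temperature readings for " + timeFrame;
    }

    // DataManager: serve from cache if present, otherwise fetch and remember.
    static String getData(String timeFrame) {
        return cache.computeIfAbsent(timeFrame, DataManagerSketch::queryDatabase);
    }

    public static void main(String[] args) {
        getData("10AM-12PM");
        getData("10AM-12PM");                   // second call is a cache hit
        System.out.println(databaseHits.get()); // 1
    }
}
```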

LouisJenkinsCS commented 7 years ago

I also designed it so that each module follows the "Single Responsibility Principle": each should do one thing, and do it well. This also makes unit testing easier, as we can test each individual module, and they should be purely stateless except for their reliance on the database. As well, the database needs to support atomic transactions.
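Because each module is stateless, a unit test is just input → output with no setup or teardown. A minimal sketch (the module and method names here are hypothetical, not from the actual codebase):

```java
public class TimeFrameModuleTest {
    // A stateless module method: pure input -> output, no hidden state, no side effects.
    static int minutesBetween(int startHour, int endHour) {
        if (endHour < startHour) throw new IllegalArgumentException("end before start");
        return (endHour - startHour) * 60;
    }

    public static void main(String[] args) {
        // Statelessness means calls can run in any order, on any thread,
        // and repeated calls always agree -- exactly what the pipeline needs.
        if (minutesBetween(10, 12) != 120) throw new AssertionError();
        if (minutesBetween(10, 10) != 0) throw new AssertionError();
        System.out.println("all tests passed");
    }
}
```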

LouisJenkinsCS commented 7 years ago

image

LouisJenkinsCS commented 7 years ago

Control Flow Diagram

Diagram

Below is a diagram depicting the flow of control that should occur during the processing of a user's request. The letters A, B, C, and D correspond to points where a thread will 'pass off' work to another so that it may make forward progress for another client. The reasons why are explained later in this document; however, know that each letter does not correspond to a dedicated thread, but to where the current stage of the pipeline ends and another begins. It is possible for a thread, if it is idle, to pick up the same task.

image

Why

The primary issue, especially for a public service, is that it needs to scale. While realistically the number of expected users may be low enough for it not to matter, I will bring up a very realistic scenario. Let us assume that we use a bounded number of threads N, and that there are N users that need to be attended to concurrently. Now assume that all N threads are busy obtaining data to display a graph, and that another client, the N+1st, requests data. While it is true that a context switch could ensure that the client is served promptly, this is not without its own overhead. As well, what about when you have 2N users? 10N? 100N? Eventually the server will grind to a halt due to traffic, even under realistic numbers.
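The N+1 problem can be seen directly: with a pool of N threads all blocked on long-running work, client N+1's task sits in the queue untouched. A small stdlib demonstration (not the actual server code):

```java
import java.util.concurrent.*;

public class SaturationSketch {
    // Returns true if an (n+1)-th task fails to even start within the timeout
    // while n long-running tasks occupy an n-thread pool.
    static boolean extraClientStarves(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < n; i++)
            pool.submit(() -> { release.await(); return null; }); // N busy "graph" requests
        Future<String> extra = pool.submit(() -> "served");       // client N+1 waits in the queue
        boolean starved;
        try {
            extra.get(200, TimeUnit.MILLISECONDS);
            starved = false;
        } catch (TimeoutException e) {
            starved = true; // the pool never got to it
        }
        release.countDown(); // free the pool so everything can finish
        pool.shutdown();
        return starved;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extraClientStarves(4)); // true: client N+1 was stuck behind N busy threads
    }
}
```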

Why is my model better than Apache Tomcat's default servlet implementation? Tomcat, as documented, uses a bounded thread pool and will block rather than process any more connections, meaning that while the currently processed connections are assured to be processed quickly, its queue of waiting connections is not. My proposed model of 'passing off' work to multiple threads (that is, multiplexing many connections on a bounded number of threads) ensures that ALL connections are eventually serviced. This principle drives the movement toward better concurrency models, such as the focus on wait-freedom: "...every operation has a bound on the number of steps the algorithm will take before the operation completes".

Control Flow

On to the actual control flow. I will include an example to demonstrate what I mean: imagine that the User wishes to see the chart for Temperature within the given time frame of 10AM and 2PM.

Client C1 sends an HTTP Request to our web page. This request is received on a thread A. (Realistically, A would come from Apache's thread pool)

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C1,ControlServlet | IDLE | IDLE | IDLE |

The request gets processed in ControlServlet which will create an AsyncContext and defer further processing to RequestProcessor.

Thread A is finished and goes back to accept another client C2.

The RequestProcessor is operating on thread B and servicing C1

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C2,ControlServlet | C1,RequestProcessor | IDLE | IDLE |

The RequestProcessor obtains the required information and passes off to JSPModule with the Pair<String, TimeFrame> on Thread C

Thread B is awaiting the return from Thread C to handle C1, which frees it up to process the request from C2.

Thread A is finished and goes back to accept another client C3. (Last time I'll bring up the other threads explicitly, but the tables will still track their status)

The RequestProcessor is servicing client C2.

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C3,ControlServlet | C2,RequestProcessor | C1,JSPModule | IDLE |
| IDLE | IDLE | IDLE | IDLE |

The JSPModule calls GraphGenerator to obtain the graph to show.

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C4,ControlServlet | C3,RequestProcessor | C1,GraphGenerator | IDLE |
| IDLE | IDLE | C2,RequestProcessor | IDLE |

GraphGenerator awaits the return of the DataSet from DatabaseManager that is on Thread D.

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C5,ControlServlet | C4,RequestProcessor | C2,JSPModule | C1,DatabaseManager |
| IDLE | IDLE | C3,RequestProcessor | IDLE |

DatabaseManager retrieves the DataSet and returns it.

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C6,ControlServlet | C5,RequestProcessor | C2,GraphGenerator | IDLE |
| IDLE | IDLE | C3,RequestProcessor | IDLE |
| IDLE | IDLE | C4,RequestProcessor | IDLE |
| IDLE | IDLE | C1,DatabaseManager | IDLE |

Some time later, GraphGenerator constructs its JFreeChart and returns it. (Client states are predicted)

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C11,ControlServlet | C10,RequestProcessor | C1,JSPModule | C3,DatabaseManager |
| IDLE | C9,RequestProcessor | C2,JSPModule | C4,DatabaseManager |
| IDLE | C8,RequestProcessor | C7,RequestProcessor | IDLE |
| IDLE | IDLE | C6,RequestProcessor | IDLE |
| IDLE | IDLE | C5,RequestProcessor | IDLE |

JSPModule calls JSPGenerator and passes JFreeChart as parameter

JSPGenerator constructs the JSP page and returns it

JSPModule returns the JSP page

Thread B picks up where it left off, and dispatches it. (Fast Forward...)

| Thread A | Thread B | Thread C | Thread D |
| --- | --- | --- | --- |
| C20,ControlServlet | C1,HTTPDispatcher | C16,JSPModule | C11,DatabaseManager |
| C21,ControlServlet | C2,RequestProcessor | C15,JSPModule | C10,DatabaseManager |
| IDLE | C19,RequestProcessor | C14,RequestProcessor | C9,DatabaseManager |
| IDLE | C18,RequestProcessor | C13,RequestProcessor | IDLE |
| IDLE | C17,RequestProcessor | C12,RequestProcessor | IDLE |

Conclusion

What am I trying to present with the above example? That unlike the default implementation of a bounded thread pool, where we dedicate one thread per connection, we can instead service multiple connections at once by doing each little-by-little. In the above example we had 4 threads, but could theoretically service 21 connections concurrently.