LouisJenkinsCS opened 7 years ago
Soon I'll add a specification for the User object in general. In particular, it will need some additional stateful information: is it in a connected state, is it awaiting a web page that requires data from the database, is it in the middle of a transaction, did it modify its account information, is it currently signed in, etc. From this information we can move from one state to another.
New update for control flow diagram.
Just like before, the Client Browser submits an HTTP request, which is received by the `ControlServlet`. The `ControlServlet`, beyond basic processing, immediately creates an `AsyncContext` that will handle the rest of the processing so we may accept the next client. The `RequestProcessor` will handle all requests regarding the current client.
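The hand-off can be sketched with plain `java.util.concurrent` as follows. This is only an illustration of the idea, not the project's actual servlet code; the names `acceptRequest` and `workers` are hypothetical stand-ins for the `ControlServlet`/`AsyncContext` pair.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the hand-off: the "HTTP thread" only accepts the request and
// immediately defers the real work to a worker pool, mirroring how the
// ControlServlet starts an AsyncContext and lets RequestProcessor continue.
// All names here are illustrative, not the project's API.
public class HandOffSketch {
    // Daemon threads so the sketch never keeps a JVM alive on its own.
    static final ExecutorService workers = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // Returns immediately; processing continues on a worker thread,
    // freeing the accepting thread for the next client.
    static Future<String> acceptRequest(String request) {
        return workers.submit(() -> "processed:" + request);
    }

    public static void main(String[] args) throws Exception {
        Future<String> response = acceptRequest("GET /graph");
        System.out.println(response.get()); // prints "processed:GET /graph"
    }
}
```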
It is important to note that while we may have improved the number of clients we can take on at once, this does not mean we are scalable yet. The key is that everything is broken down into short (and stateless) tasks. If we had one long-running task, such as handling 100% of a request, we'd end up with the same problem we're trying to avoid. If each task instead handles, say, 10%, then we can form a sort of pipeline.
Example: let's say the user wants a graph of certain parameters (say, temperature) between 10AM and 12PM. With the current model, this will happen:

1. `ControlServlet` receives the HTTP request on HTTP thread A and forwards it asynchronously to `RequestProcessor`.
2. `RequestProcessor` needs to access data first, so on worker thread B it goes through `DataModule` and asks `DataManager` for the data between 10AM and 12PM.
3. `DataManager` checks whether it has cached that data; if not, it asks `DatabaseManager` for all data between 10AM and 12PM. (At this time, A is servicing a new client connection.)
4. The data is returned through `DataModule` back up to `RequestProcessor`. (At this time, B has received another request for graph data from A.)
5. `RequestProcessor` passes the result to `JSPModule`, which is handled on worker thread C.
6. `JSPModule` calls `JSPGenerator` to construct the page.
7. The page is handed to `HTTPDispatcher`, which sends the HTTP response to the client.

Now all of this is complicated, and it definitely is non-trivial... if we weren't using another library to help with this. I chose RxJava because I have experience with it, and it is designed to handle backpressure and to evenly distribute work among its worker threads. This is why I added it as a library to the project. It is also another reason why all parts must be modular enough to return data, rather than rely on side-effect-induced behavior.
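The staged pipeline above can be approximated with stdlib `CompletableFuture` chaining (the project itself uses RxJava, which additionally provides backpressure and schedulers; this sketch only shows the staging idea). The stage payloads `data(...)`, `graph[...]`, and `page{...}` are placeholders, not real module output.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A stdlib approximation of the staged pipeline: each stage is a short task,
// so a bounded pool can interleave many clients instead of dedicating one
// thread to a whole request. Stage names mirror DataManager, GraphGenerator,
// and JSPGenerator, but the payloads are illustrative strings.
public class PipelineSketch {
    static final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    static CompletableFuture<String> handle(String query) {
        return CompletableFuture
            .supplyAsync(() -> "data(" + query + ")", pool)  // DataManager stage
            .thenApplyAsync(d -> "graph[" + d + "]", pool)   // GraphGenerator stage
            .thenApplyAsync(g -> "page{" + g + "}", pool);   // JSPGenerator stage
    }

    public static void main(String[] args) throws Exception {
        // Each thenApplyAsync boundary is a point where a thread may "pass off"
        // the work and pick up a different client's stage.
        System.out.println(handle("10AM-12PM").get());
    }
}
```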
As well, External Data is data sent to the server, which the `DatabaseManager` handles.
I also designed it so that each module follows the Single Responsibility Principle: each should do one thing, and do it well. This also makes unit testing easier, as we can test each module individually; they should be purely stateless except for reliance on the database. The database also needs to support atomic transactions.
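One way the statelessness and single responsibility pay off in testing: if loading is a pure function of its input and caching is layered on separately, each piece can be verified in isolation. The sketch below is hypothetical, not the project's `DataManager`; the loader stands in for a `DatabaseManager` query.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Illustration of the "one module, one responsibility" idea: the loader is a
// pure function of its input (easy to unit test), and caching is a separate,
// composable concern. All names here are hypothetical.
public class CachingSketch {
    private final ConcurrentMap<String, List<Double>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<Double>> loader; // e.g. a database query

    CachingSketch(Function<String, List<Double>> loader) {
        this.loader = loader;
    }

    // Returns cached data if present; otherwise loads it once and caches it.
    List<Double> dataFor(String timeFrame) {
        return cache.computeIfAbsent(timeFrame, loader);
    }
}
```

A unit test can inject a counting loader and assert the loader runs exactly once per distinct time frame, with no database required.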
Below is a diagram depicting the flow of control that should occur during the processing of a user's request. The letters A, B, C, and D correspond to where a thread will 'pass off' work to another so that it may make forward progress for another client. The reasons why are explained later in this document; however, know that each letter does not correspond to a dedicated thread, but to where the current stage of the pipeline ends and another begins. It is possible for a thread, if it is idle, to pick up the same task.
The primary issue, especially for a public service, is that it needs to scale. While the number of expected users may realistically be low enough for it not to matter, I will bring up a very realistic scenario. Let us assume that we use a bounded number of threads N, and that there are N users that need to be attended to concurrently. Now assume that all N threads are busy obtaining data to display a graph, and that another client, the (N+1)th, requests data. While it is true that a context switch could ensure that this client is served promptly, that is not without its own overhead. And what about when you have 2N users? 10N? 100N? Eventually the server will grind to a halt due to traffic, even under realistic numbers.
Why is my model better than Apache Tomcat's default servlet implementation? The default, as documented, uses a bounded thread pool and will block rather than process any more connections, meaning that while the currently processed connections are assured to be handled quickly, its queue of waiting connections is not. My proposed model of 'passing off' work to multiple threads (that is, multiplexing many connections over a bounded number of threads) ensures that ALL connections are eventually serviced. This principle underlies the movement toward better concurrency models, such as the focus on wait-freedom: "...every operation has a bound on the number of steps the algorithm will take before the operation completes".
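The saturation argument can be demonstrated in miniature with a stdlib fixed-size pool (this is not Tomcat itself, just the same bounded-pool behavior): with all N threads blocked on long tasks, request N+1 sits in the queue and makes no progress until a thread frees up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Demonstration of the bounded-pool saturation problem: N long-running,
// blocking tasks occupy all N threads, so the (N+1)th request is stuck
// queued, not served promptly. Pure stdlib; a toy model, not Tomcat.
public class SaturationDemo {
    static String run() throws Exception {
        int n = 2;
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CountDownLatch release = new CountDownLatch(1);
        // Occupy all N threads with tasks that block until released.
        for (int i = 0; i < n; i++) {
            pool.submit(() -> { release.await(); return null; });
        }
        // The (N+1)th request: it can only wait in the queue.
        Future<String> quick = pool.submit(() -> "served");
        String state = quick.isDone() ? "done" : "queued"; // "queued": pool saturated
        release.countDown();             // long tasks finish, freeing threads
        String result = quick.get();     // "served" -- eventually, not promptly
        pool.shutdown();
        return state + "/" + result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints "queued/served"
    }
}
```

Breaking long tasks into short stages, as proposed above, is what prevents any single client from pinning a thread for the whole duration.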
On to the actual control flow. To demonstrate what I mean, imagine that the User wishes to see the chart for Temperature within the given time frame of 10AM to 2PM.
Client C1 sends an HTTP Request to our web page. This request is received on a thread A. (Realistically, A would be from Apache's thread pool.)
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C1,ControlServlet | IDLE | IDLE | IDLE |
The request gets processed in `ControlServlet`, which will create an `AsyncContext` and defer further processing to `RequestProcessor`.
Thread A is finished and goes back to accept another client, C2.
The `RequestProcessor` is operating on thread B and servicing C1.
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C2,ControlServlet | C1,RequestProcessor | IDLE | IDLE |
The `RequestProcessor` obtains the required information and passes off to `JSPModule` with the `Pair<String, TimeFrame>` on thread C. Thread B hands C1 off to thread C rather than blocking on the result, which frees it up to process the request from C2.
Thread A is finished and goes back to accept another client, C3. (This is the last time I'll bring up the other threads explicitly, but the tables will still track their status.)
The `RequestProcessor` is servicing client C2.
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C3,ControlServlet | C2,RequestProcessor | C1, JSPModule | IDLE |
IDLE | IDLE | IDLE | IDLE |
The `JSPModule` calls `GraphGenerator` to obtain the graph to show.
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C4,ControlServlet | C3,RequestProcessor | C1,GraphGenerator | IDLE |
IDLE | IDLE | C2,RequestProcessor | IDLE |
`GraphGenerator` awaits the return of the `DataSet` from `DatabaseManager`, which is on thread D.
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C5,ControlServlet | C4,RequestProcessor | C2,JSPModule | C1,DatabaseManager |
IDLE | IDLE | C3,RequestProcessor | IDLE |
`DatabaseManager` retrieves the `DataSet` and returns it.
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C6,ControlServlet | C5,RequestProcessor | C2,GraphGenerator | IDLE |
IDLE | IDLE | C3,RequestProcessor | IDLE |
IDLE | IDLE | C4,RequestProcessor | IDLE |
IDLE | IDLE | C1, DatabaseManager | IDLE |
Some time later, `GraphGenerator` constructs its `JFreeChart` and returns it. (Client states are predicted.)
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C11,ControlServlet | C10,RequestProcessor | C1,JSPModule | C3,DatabaseManager |
IDLE | C9,RequestProcessor | C2,JSPModule | C4,DatabaseManager |
IDLE | C8,RequestProcessor | C7,RequestProcessor | IDLE |
IDLE | IDLE | C6,RequestProcessor | IDLE |
IDLE | IDLE | C5, RequestProcessor | IDLE |
`JSPModule` calls `JSPGenerator` and passes the `JFreeChart` as a parameter.
`JSPGenerator` constructs the JSP page and returns it.
`JSPModule` returns the JSP page.
Thread B picks up where it left off and dispatches it. (Fast forward...)
Thread A | Thread B | Thread C | Thread D |
---|---|---|---|
C20,ControlServlet | C1,HTTPDispatcher | C16,JSPModule | C11,DatabaseManager |
C21,ControlServlet | C2,RequestProcessor | C15,JSPModule | C10,DatabaseManager |
IDLE | C19,RequestProcessor | C14,RequestProcessor | C9,DatabaseManager |
IDLE | C18, RequestProcessor | C13,RequestProcessor | IDLE |
IDLE | C17,RequestProcessor | C12, RequestProcessor | IDLE |
What am I trying to present with the above example? That unlike the default implementation of a bounded thread pool, where we dedicate one thread per connection, we can instead service multiple connections at once by doing each little by little. In the above example we had 4 threads, but could theoretically service 21 connections concurrently.
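The 4-threads/21-clients claim is easy to check in miniature with the same staged approach (stdlib only; the stage strings are placeholders, not real module output): every client's request is split into three short stages, no client owns a thread, and all 21 complete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The claim above in miniature: 4 threads, 21 clients, each request staged
// into short tasks -- all 21 complete even though no client has a dedicated
// thread. Stage names mirror the modules in the walkthrough; payloads are
// illustrative strings.
public class MultiplexDemo {
    static List<String> serveAll(int clients) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<CompletableFuture<String>> pages = new ArrayList<>();
        for (int c = 1; c <= clients; c++) {
            final int id = c;
            pages.add(CompletableFuture
                .supplyAsync(() -> "data" + id, pool)      // fetch stage
                .thenApplyAsync(d -> d + ":graph", pool)   // render stage
                .thenApplyAsync(g -> g + ":page", pool));  // page stage
        }
        List<String> out = new ArrayList<>();
        for (CompletableFuture<String> p : pages) out.add(p.get());
        pool.shutdown();
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serveAll(21).size() + " pages served by 4 threads");
    }
}
```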
My proposed design model.