jolie / jolie

The Jolie programming language
https://www.jolie-lang.org/

Concurrent provide-until #14

Closed fmontesi closed 9 years ago

fmontesi commented 9 years ago

Hi all,

Provide-until is already full of surprises. Here goes a crazy idea for dealing with a problem I encountered while using it.

Problem

Suppose you have a process that implements a chat room. It works like this: first you open the chat room; then, operations getHistory (for reading the content of the chat room) and postMessage (for posting a new message) are provided until operation closeChatRoom is called. We can elegantly encode this with the provide-until statement:

createChatRoom( room );
history = "";
provide
  [ getHistory()( history ) { nullProcess } ]
  [ postMessage( msg )() { history += msg + "\n" } ]
until
  [ closeChatRoom() ]

Nice, but inefficient: this is a typical readers/writers problem, and provide-until is currently always sequential. We should instead allow getHistory to be invoked in parallel, since it doesn't change our shared state. By contrast, postMessage should indeed be sequential (no other threads should be changing the value of history while it is operating).

Solution 1 - Concurrent provide-until

We could introduce the option of telling provide-until which operations should be provided concurrently and which sequentially, for example:

provide!
  [ getHistory()( history ) { nullProcess } ]
provide*
  [ postMessage( msg )() { history += msg + "\n" } ]
until
  [ closeChatRoom() ]

The semantics of this would be: operations listed under provide! may be served concurrently (several calls can be in progress at the same time), while operations listed under provide* are served sequentially, i.e., mutually exclusively with respect to every other operation call in the block.

Solution 2 - Custom policies

The solution above works for typical readers/writers, but there may be scenarios where it doesn't suffice. In such cases, the programmer may want to specify a custom policy telling Jolie when an operation should become available, maybe using internal links. Assuming that all provide operations can go in parallel, here is an example of how to realise the scenario (forgive the long syntax):

/* This block is called R,
    and is active only when no operation calls are being processed in block W */
provide[R requires W]
  [ getHistory()( history ) { nullProcess } ]

/* This block is called W,
    and is active only when no operation calls are being processed in blocks R and W */
provide[W requires R,W]
  [ postMessage( msg )() { history += msg + "\n" } ]

// This block is active only when no other operation call is in progress in the other blocks
until
  [ closeChatRoom() ]

Solution 3 - Asynchronous Parallel

Another solution could be to use recursion. The current provide-until block can already be simulated with it; for example, the following

provide
  [ getHistory()( history ) { nullProcess } ]
  [ postMessage( msg )() { history += msg + "\n" } ]
until
  [ closeChatRoom() ]

can be implemented as:

define X {
  [ getHistory()( history ) { nullProcess } ] { X }
  [ postMessage( msg )() { history += msg + "\n" } ] { X }
  [ closeChatRoom() ]
}

This, however, fails if we want concurrency for getHistory:

define X {
  [ getHistory()( history ) { X | nullProcess } ]
  [ postMessage( msg )() { history += msg + "\n" } ] { X }
  [ closeChatRoom() ]
}

The reason is that X | nullProcess makes getHistory wait for the execution of X, which is not what we want. To get it right, we would need to start X in parallel without waiting for it to terminate. Suppose we had such an operator, ||:

define X {
  [ getHistory()( history ) { X || nullProcess } ]
  [ postMessage( msg )() { history += msg + "\n" } ] { X }
  [ closeChatRoom() ]
}

Now X is executed and getHistory returns immediately, without waiting for X to terminate. But wait: now postMessage can become active while some getHistory call is still computing. So we would also need a magical operator for postMessage to wait for all the parallel threads currently handling getHistory to terminate:

define X {
  [ getHistory()( history ) { X || nullProcess } ]
  [ postMessage( msg )() { wait_for_all_getHistory(); history += msg + "\n" } ] { X }
  [ closeChatRoom() ]
}

Conclusions

I hope you agree that provide-until can become more useful by powering it up a little. If not, and you have an idea of how to do this with something we already have, I would be even happier, because it would entail less development. ;-) I showed some different potential solutions. Which to follow? For now my preference is option 1 over 2 over 3. I dislike 3 as it is less declarative and less simple, but we should always think about what we can already do before introducing something new, albeit convenient. I find it really interesting that while provide-until right now (let's call it sequential provide-until) is just syntactic sugar, it looks like handling concurrency would make it a full-standing primitive in its own right.

A side note on execution modalities

It looks like having option 1 or 2 would effectively mix our three execution modalities (concurrent, sequential, and single) in a single primitive in a way that makes sense. I really like this, and maybe understanding it could pave the way for mixing execution modalities in the same service in a similar way, in the next-next version.

klag commented 9 years ago

I am not convinced about the usefulness of a concurrent provide-until primitive. I think that the problem should be approached at the level of the architecture instead of at the level of a single service.

In this case there should be a chatService that is in charge of storing all the data of a given chat, and a workflowService that is in charge of providing the flow.


chatService:

[ getHistory()() ... ]
[ postMessage()() ... ]
[ closeChatRoom()() ... ]

workflowService:

outputPort chatServicePartial { RequestResponse: getHistory }
inputPort ... { RequestResponse: postMessage, closeChatRoom Aggregates: chatServicePartial }

provide postMessage until closeChatRoom

The workflowService protects the resource service chatService by governing access to its operations: it uses a provide-until for postMessage and closeChatRoom, whereas getHistory messages are forwarded directly to the chatService.
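To make this concrete, here is a minimal Jolie sketch of the workflowService under this design; the locations, protocols, and interface names (GetHistoryIface, ChatIface, WorkflowIface) are hypothetical placeholders:

// Both output ports point to the same chatService; only the partial one is aggregated,
// so only getHistory is forwarded automatically (and thus served concurrently).
outputPort chatServicePartial {
Location: "socket://localhost:8001"
Protocol: sodep
Interfaces: GetHistoryIface   // contains getHistory only
}

outputPort chatService {
Location: "socket://localhost:8001"
Protocol: sodep
Interfaces: ChatIface         // getHistory, postMessage, closeChatRoom
}

inputPort WorkflowInput {
Location: "socket://localhost:8000"
Protocol: sodep
Interfaces: WorkflowIface     // postMessage, closeChatRoom
Aggregates: chatServicePartial
}

main
{
  provide
    [ postMessage( msg )() { postMessage@chatService( msg )() } ]
  until
    [ closeChatRoom() ]
}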

As far as correlation sets are concerned, in the chatService you would have an identifier that targets a specific chat (à la REST), whereas the workflow would manage the interaction sessions. It could be nice to have the same identifier of the chatService as the session id of the workflow service.
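For instance, a correlation set along these lines could make the chat identifier double as the session id; the message type names and the chatId field are hypothetical:

// Every chat operation carries a chatId, which correlates incoming messages to sessions.
cset {
  chatId: GetHistoryRequest.chatId
          PostMessageRequest.chatId
          CloseChatRoomRequest.chatId
}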

I would investigate this last issue of propagating correlation sets instead of making provide-until more complex. Distributed computing is a difficult task from the design point of view. I would keep the primitives for single services simple and shift complexity to the level of the architecture. This is particularly important from a software engineering point of view, because simple services should deal with simple functions (protecting a data source, providing some kind of computation, etc.), whereas orchestrators and workflow services should deal with flows at the level of the architecture.

fmontesi commented 9 years ago

I agree with the philosophy (what I need to implement is the workflow service; imagine using an external service for the history variable). I did not understand your comment about the cset though: since the data (resource) service needs to give me access in parallel, I need everything on top to use concurrent execution, or I would end up with a very ugly recursive-parallel solution inside the data service behaviour.

However, your solution does not work, because the workflow service will allow me to invoke getHistory and postMessage in parallel. What I want is readers/writers: getHistory can be invoked multiple times in parallel, but never together with postMessage. Your implementation only prevents me from invoking postMessage in parallel.

Then again, this suggests another possible solution.

The only efficient way we have to do proper readers/writers right now is through an external Java service that implements a readers/writers lock in Java, which we access from Jolie using a solicit-response for acquiring the lock and another operation for releasing it.
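The Jolie-side view of such a service could look roughly like this; the interface, operation names, and Java class name are hypothetical:

interface LockIface {
RequestResponse:
  reader( void )( void ),
  writer( void )( void ),
  release( void )( void )
}

// The Java class implementing the readers/writers lock is embedded behind this port.
outputPort Lock {
Interfaces: LockIface
}

embedded {
Java: "example.RWLock" in Lock
}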

I did not want to use this solution because acquiring and releasing locks is quite error-prone, and I surely do not want to have to handle that in my everyday workflows (and situations like this chat thing happen often). I'm referring to code like this, which I don't like particularly:

[ getHistory()() { reader@Lock()(); ...; release@Lock()() } ]
[ postMessage()() { writer@Lock()(); ...; release@Lock()() } ]

Using courier processes we could make this less error-prone though, by designing a generic readers/writers service that secures the backend data service. We could define ReaderIface and WriterIface interfaces; the courier for the first would invoke reader@Lock, and the one for the second would invoke writer@Lock. This solution scales because if I change the interfaces I do not need to change the code in the courier processes.
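A minimal sketch of those couriers, assuming a DataInput input port in front of the data service and the hypothetical Lock port from above:

courier DataInput {
  [ interface ReaderIface( request )( response ) ] {
    reader@Lock()();              // acquire shared read access
    forward( request )( response );
    release@Lock()()
  }

  [ interface WriterIface( request )( response ) ] {
    writer@Lock()();              // acquire exclusive write access
    forward( request )( response );
    release@Lock()()
  }
}

Locking then lives in one place, and the operation implementations never touch the lock themselves.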

This would remove every workflow though. We would not see the lifespan of a chat anymore, which does not thrill me. I like that concurrent provide-until lets me see the lifecycle of a chat declaratively. With this courier solution I proposed building on yours, the data service would instead become a big input choice handling a global variable that we would have to update manually. I am also afraid that this would not scale very well with multiparty sessions (situations where we use multiple csets).

klag commented 9 years ago

I think that designing an access policy for a given resource is a problem that has to be managed at the level of the resource service; in our example, that is the chatService. I agree that there is no specific primitive so far for doing what you want, but in this case I would rather investigate this kind of primitive, for example an extension of the synchronized primitive to manage locks better, or maybe the introduction of a new primitive like waitFor.

ex:

[ getHistory()() { waitFor( lock ) { } } ]

[ postMessage()() { synchronized( lock ) { } } ]

waitFor just checks whether the lock has been taken: if so, it waits; otherwise it continues. synchronized can continue only if there is no other synchronized pending and no waitFor pending.

Otherwise, an external service LockService written in Jolie could implement more complicated scenarios.

As far as the workflow is concerned, the client sees a getHistory operation that is always available and a postMessage that must be invoked sequentially.

From an architectural point of view you could obtain the same with:

1) a SessionManager, which implements createChat and closeChat
2) a HistoryService, which implements a concurrent getHistory
3) a MessageService, which implements a sequential postMessage
4) a chatService, which implements both getHistory and postMessage with access policies

The HistoryService and the MessageService are aggregated into one single service, with a courier that checks the session_id every time there is a getHistory or a postMessage. In this case you do not need the provide-until primitive (this is why I would like to have a light version of it instead of a complicated one).
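A sketch of such a session-checking courier, assuming a hypothetical ChatOpsIface covering getHistory and postMessage, and a hypothetical check operation on the SessionManager:

courier AggregatorInput {
  [ interface ChatOpsIface( request )( response ) ] {
    // Validate the session token before forwarding to the aggregated services.
    check@SessionManager( request.session_id )( valid );
    if ( valid ) {
      forward( request )( response )
    } else {
      throw( InvalidSession )
    }
  }
}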

the workflow at the level of the client is:

1) get a session token from the session manager
2) invoke getHistory or postMessage
3) closeChat

The workflow is spread over the architecture and there is no single service that implements it. That is where choreographies can provide big benefits: describing a network-wide workflow that does not concretely exist in any single service.

fmontesi commented 9 years ago

Mmh I don't like waitFor too much. I don't even like synchronized too much. :-)

I would like something based on messages, maybe a generic block construct that guarantees that something (e.g., a solicit-response) gets done before and after a code block, regardless of faults.

I think we need something like this because it can be extended. Otherwise we risk having a primitive for each concurrency policy, which does not seem a nice prospect.

fmontesi commented 9 years ago

OK, here's a super proposal devised in collaboration with @thesave .

If we have #26 (parametric procedures) and a primitive for running code in parallel without waiting for it to finish, I can implement the scenario in a non-error-prone way like this and also support custom locking policies for our data services!

Let us suppose that we have such a primitive, fork (please suggest a better name if you have an idea):

define Reader( A, B )
{
  subscribeToReaderLock@RW( A ); readerLock()() { B }
}

define X
{
  [ read()( history ) { Reader( roomName, { fork X } ) } ]
  [ post( msg )() { Writer( roomName, { history += msg } ) } ] { X }
  [ closeRoom() ]
}

main
{
  createChatRoom( room );
  history = "";
  X
}

I omitted the definition of Writer.
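For completeness, a symmetric sketch of what Writer could look like under the same assumptions (the RW operations are hypothetical, mirroring those used by Reader):

define Writer( A, B )
{
  // Mirrors Reader, but subscribes for exclusive (write) access.
  subscribeToWriterLock@RW( A ); writerLock()() { B }
}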

In this way, we may define custom locking policies (and god knows what more!!) as parametric procedures and then make libraries out of them! For example, we may make a Java service for a readers/writers lock, and in its include file we would also put the definitions of the procedures Reader and Writer.

Do we like fork and #26? ;-) I like them a lot actually: they do not change much and they look very powerful. :-)

klag commented 9 years ago

It is not clear to me why we need such a construct to do the following:

main
{
  [ read()( history ) { /* send a reader lock request to the RW service */ } ]
  [ post( msg )() { /* send a write lock request to the RW service */ } ]
  [ closeRoom() ]
}

The RW service will be responsible for governing the access policies.
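Spelled out with hypothetical lock operations on an RW output port, that would read roughly as follows (which is exactly the acquire/release pattern discussed above):

main
{
  [ read()( history ) { readerLock@RW()(); /* read history */ releaseLock@RW()() } ]
  [ post( msg )() { writerLock@RW()(); history += msg + "\n"; releaseLock@RW()() } ]
  [ closeRoom() ]
}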

fmontesi commented 9 years ago

I'm closing this for now as it seems like I can cover my use case of interest without it.