[rfc 119] Best practices for remote vs. local references to svcs

bjhargrave commented 16 years ago

Original bug ID: BZ#697 From: tim.diekmann@siemens.com Reported version: R4 V4.2

bjhargrave commented 16 years ago

Comment author: tim.diekmann@siemens.com

Existing clients that are not aware of remote capabilities will most likely not be able to properly deal with exceptions thrown because of the distributed nature. Address the comments in the document by writing a section in a best practice chapter to discuss how clients and deployers should handle this situation.

bjhargrave commented 16 years ago

Comment author: Richard Hall <richard.s.hall@sun.com>

I wasn't part of the discussion for this issue, so perhaps I am missing the background to comment. However, I am not sure that much extra clarification is necessary. The dynamism in OSGi services results in errors that are very similar to the errors you get in distributed computing. Thus, any time OSGi code touches a service, it should be prepared to deal with the fact that an exception can be thrown due to dynamism. So the best practices for both local and remote should be similar.

bjhargrave commented 16 years ago

Comment author: Jan S. Rellermeyer <rellermeyer@inf.ethz.ch>

IMHO, best practice for remote services is in fact to distinguish between (e.g., checked) exceptions thrown by the service (which would most likely happen locally as well), and those which arise only from the fact that we have introduced a network between the caller and the callee, i.e., communication errors. Whereas the first kind is typically serialized and rethrown by the proxy to mimic a consistent behavior, the second category is typically mapped to a (single) RuntimeException. By definition, RuntimeExceptions can happen at any time (e.g., due to bugs like dereferencing a null pointer or dividing by zero) and any Java program has to be prepared for them, or ignore them and risk that the VM terminates. Therefore, even though it makes sense to distinguish communication errors from "ordinary" exceptions, they don't really change the requirements for a client in terms of exception handling.
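
To make the two categories concrete, here is a minimal sketch of a hand-written proxy. Every name in it (AccountService, Transport, RemoteCommunicationException, and so on) is hypothetical and chosen for illustration only; none of it comes from RFC 119 or from any distribution software. The Transport interface is just a stand-in for the wire.

```java
import java.io.IOException;

// Hypothetical: the single unchecked exception that communication errors are mapped to.
class RemoteCommunicationException extends RuntimeException {
    RemoteCommunicationException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical: a checked business exception declared by the service interface.
class InsufficientFundsException extends Exception {
    InsufficientFundsException(String message) {
        super(message);
    }
}

// Hypothetical service interface, identical for local and remote callers.
interface AccountService {
    void withdraw(int amount) throws InsufficientFundsException;
}

// Hypothetical stand-in for the network: returns null on success, or the
// business failure message reported by the remote service. Throws
// IOException when the request or the response is lost.
interface Transport {
    String send(String request) throws IOException;
}

class AccountServiceProxy implements AccountService {
    private final Transport transport;

    AccountServiceProxy(Transport transport) {
        this.transport = transport;
    }

    @Override
    public void withdraw(int amount) throws InsufficientFundsException {
        String businessFailure;
        try {
            businessFailure = transport.send("withdraw:" + amount);
        } catch (IOException e) {
            // Second category: exists only because a network sits in between.
            throw new RemoteCommunicationException("withdraw could not be completed", e);
        }
        if (businessFailure != null) {
            // First category: rethrown so the caller sees the same checked
            // exception a local service would have thrown.
            throw new InsufficientFundsException(businessFailure);
        }
    }
}
```

A client unaware of remoting only ever sees the declared checked exception or some RuntimeException, while a remote-aware client can catch the specific unchecked type.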

bjhargrave commented 16 years ago

Comment author: Richard Hall <richard.s.hall@sun.com>

Agreed. That was pretty much my point. There isn't much any client is going to be able to do in this case, especially clients that are not aware that the service might be remote in the first place. Thus, the best practices are the same for any OSGi service where stuff can go away and start throwing exceptions for some reason. However, if someone wants to find some way to distinguish between remote and local exceptions, I think that is fine, but it seems orthogonal to the issue being raised here.

bjhargrave commented 16 years ago

Comment author: tim.diekmann@siemens.com

It does make a difference whether the exception was thrown by the business logic or the communication infrastructure. In case of a communication problem, you cannot be sure about the success of your service invocation. A retry may be adequate, since the response was not from the service, but the infrastructure. The best practice section will address the fact that there is a difference and an application may choose to know about it. It may also choose to ignore the communication infrastructure detail. It would be a fallacy to assume that both cases are equal.
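
As a rough illustration of that retry point, the sketch below retries only when a (hypothetical) CommunicationException is thrown and lets every other exception propagate on the first attempt. The class and method names are assumptions made for this example. It also assumes the operation is safe to repeat: because the outcome of a failed attempt is unknown, the first request may already have been acted upon.

```java
import java.util.concurrent.Callable;

// Hypothetical unchecked exception raised by the communication infrastructure.
class CommunicationException extends RuntimeException {
    CommunicationException(String message) {
        super(message);
    }
}

final class Retry {
    private Retry() {
    }

    // Invokes the call up to maxAttempts times, retrying only on
    // CommunicationException. Any other exception comes from the service
    // itself and is propagated immediately without a retry.
    static <T> T onCommunicationFailure(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        CommunicationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (CommunicationException e) {
                // Outcome unknown: the request may or may not have reached the service.
                last = e;
            }
        }
        // All attempts failed at the infrastructure level.
        throw last;
    }
}
```

A client that chooses to ignore the distinction simply does not use such a wrapper and treats the exception like any other RuntimeException.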

bjhargrave commented 16 years ago

Comment author: Richard Hall <richard.s.hall@sun.com>

The original description talks about clients unaware of remoting, so I don't see how the distinction will help them at all. However, I agree that remote-aware clients would benefit from such a distinction. Regardless, all clients in OSGi must be aware that they can get an exception any time they invoke methods on a service interface.

bjhargrave commented 15 years ago

Comment author: eric.newcomer@iona.com

I was reading this to try to update the document, but it's hard to draw a conclusion from the discussion.

It seems to depend upon whether or not the client is aware it's invoking a remote service.

We have agreed there's no such thing as transparent distribution, since a service exhibits different behavior (or can) when a network is involved in the invocation (as opposed to invocation within the same address space).

However I am not sure we have achieved any agreement on the way in which that difference might manifest itself in additional or existing exception or exceptions.

As Jan says, the runtime exception can occur at any time and so isn't distinguishable by itself as to whether it was generated remotely or locally.

We have added an exception to indicate a problem in the "mapping layer" - i.e. the code implementing the deployment of a DSW onto an OSGi framework.

So maybe the only thing to recommend is to write remote-aware clients that can try to recognize a runtime exception as possibly due to a network issue (and perhaps test for it) and to check for an exception generated from the deployment code?

bjhargrave commented 15 years ago

Comment author: Graham Charters <charters@uk.ibm.com>

Richard's most recent comment seems to point to a bit of confusion in the original description. The original description implies that this is about clients who are not aware of remote services (and perhaps it is), but it could also be about how to design clients and services with distribution in mind, so there are probably two "best practices" to describe:

where the client is not aware of remote services. Here, I think the best practice is around how to mitigate risk through deployment (configuration of distribution software and discovery services).
where the client and service implementation are aware of remote services. Here, I think the best practice is around handling network failures or being aware of latency issues. There's also the call semantics, where we say that in the absence of any intents for passByValue or passByReference, the service should function the same when called with either semantic. This makes it more robust to distribution and enables the widest range of protocols and distribution software.

Does this make sense?

bjhargrave commented 15 years ago

Comment author: Graham Charters <charters@uk.ibm.com>

I suggest the following wording to cover best practices resulting from the introduction of Distributed OSGi. If I don't hear any complaints I'll add it to 119 in the next day or so.

Best Practices and Distributed OSGi Services

The vast majority of OSGi bundles which consume services have been designed for local service invocations. With the introduction of Distributed OSGi it is important to consider whether or not a client bundle should be able to obtain a remote service. There are two options available to avoid accidentally resolving to a remote service:

Ensure the Distribution Software and/or Discovery Service are configure to avoid remote services being made available in the client's service registry.
Install a filter hook to filter out results for remote services. RFC 126 (Service Registry Hooks) and the requirement of a Distribution Software to add the service property osgi.remote=true to all remote proxies enables this to be done.
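
The filtering logic behind the second option can be sketched independently of the framework. In the sketch below, each candidate service is modelled as a plain map of service properties, which is an assumption made to keep the example self-contained; a real hook would be registered through the Service Registry Hooks API and would remove service references from the collection the framework passes to it. The class and method names are hypothetical, and the property name osgi.remote is taken from the wording above.

```java
import java.util.Collection;
import java.util.Map;

final class RemoteServiceFilter {
    // Property name as given in the proposed wording; treated as an assumption here.
    static final String REMOTE_PROPERTY = "osgi.remote";

    private RemoteServiceFilter() {
    }

    // True when the properties mark the service as a remote proxy.
    // Accepts both a Boolean value and its String form.
    static boolean isRemote(Map<String, ?> serviceProperties) {
        Object value = serviceProperties.get(REMOTE_PROPERTY);
        return value != null && Boolean.parseBoolean(value.toString());
    }

    // Removes remote candidates in place, mirroring how a find-style hook
    // shrinks the result collection before the client sees it.
    static <M extends Map<String, ?>> void hideRemote(Collection<M> candidates) {
        candidates.removeIf(RemoteServiceFilter::isRemote);
    }
}
```

Whether the hook applies to every bundle or only to selected clients is a deployment decision that this sketch leaves open.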

When designing a service or client for distributed operation, there are a number of factors to consider:

Communications failures: ensuring proper handling of failures and uncertain outcomes, such as a requested reaching a service and being acted upon, but the response never reaching the client. Consider using a reliable transport for business-critical remote communications.
Latency issues which may lead to requests taking longer to complete, and in some cases could result in requests arriving out of sequence. Client and service implementation design can take this into account, and if necessary, a transport can be chosen which provides the desired QoS, such as assured message ordering.
Invocation semantics of a remote service are likely to be passByValue, whereas OSGi local services are passByReference. If a service is likely to be distributed, consider designing and implementing it to be agnostic of the call semantics as this will maximize the opportunity for the service to be re-used.
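
To illustrate the call-semantics point, here is a small sketch with hypothetical names. CopyingProxy only simulates passByValue by copying the argument and the result, as a stand-in for serialization over a real transport. InPlaceNormalizer relies on mutating the caller's list and therefore behaves differently once the proxy is in between, whereas AgnosticNormalizer communicates only through its return value and behaves the same either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical service interface.
interface Normalizer {
    List<String> normalize(List<String> names);
}

// Depends on passByReference: the caller observes the change in its own list.
class InPlaceNormalizer implements Normalizer {
    @Override
    public List<String> normalize(List<String> names) {
        names.replaceAll(n -> n.trim().toLowerCase(Locale.ROOT));
        return names;
    }
}

// Agnostic of call semantics: never touches the argument, returns a new list.
class AgnosticNormalizer implements Normalizer {
    @Override
    public List<String> normalize(List<String> names) {
        List<String> result = new ArrayList<>(names.size());
        for (String n : names) {
            result.add(n.trim().toLowerCase(Locale.ROOT));
        }
        return result;
    }
}

// Simulates a passByValue remote proxy by copying the argument and the result.
class CopyingProxy implements Normalizer {
    private final Normalizer target;

    CopyingProxy(Normalizer target) {
        this.target = target;
    }

    @Override
    public List<String> normalize(List<String> names) {
        return new ArrayList<>(target.normalize(new ArrayList<>(names)));
    }
}
```

A client that reads only the returned list works with either implementation and with or without the proxy; a client that inspects its own input list after the call works only with the in-place variant called locally.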

bjhargrave commented 15 years ago

Comment author: @bosschaert

Looks good to me, Graham!

bjhargrave commented 15 years ago

Comment author: schnabel@us.ibm.com

Ensure the Distribution Software and/or Discovery Service are configure to avoid remote services being made available in the client's service registry.

Should be 'configured', not 'configure'.

Communications failures: ensuring proper handling of failures and uncertain outcomes, such as a requested reaching a service and being acted upon, but the response never reaching the client. Consider using a reliable transport for business-critical remote communications.

You're missing a word in there somewhere... a requested what reaching a service? or did you mean the request itself?

bjhargrave commented 15 years ago

Comment author: Graham Charters <charters@uk.ibm.com>

Ensure the Distribution Software and/or Discovery Service are configure to avoid remote services being made available in the client's service registry.

Should be 'configured', not 'configure'.

Communications failures: ensuring proper handling of failures and uncertain outcomes, such as a requested reaching a service and being acted upon, but the response never reaching the client. Consider using a reliable transport for business-critical remote communications.

You're missing a word in there somewhere... a requested what reaching a service? or did you mean the request itself?

Thanks, Erin. The first mistake was missing a 'd' and the second had one 'd' too many. I guess on the plus side, I had all the right characters, just not in the right places ;-)

bjhargrave commented 15 years ago

Comment author: eric.newcomer@iona.com

Looks like the text has been added to the doc. Closing the bug.

osgi / bugzilla-archive

[rfc 119] Best practices for remote vs. local references to svcs #596