Open gavin-norman-sociomantic opened 5 years ago
@scott-gibson-sociomantic @don-clugston-sociomantic @daniel-zullo-sociomantic as the primary users of the neo clients, any thoughts on this?
in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported
, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error
that provides more detail in the format notification?
As for retry behaviour, i'm unsure how to approach this as my applications have different requirements (some should always retry, some can retry a few times then decide not to continue). As long as these apps can have control over the behaviour then perhaps a simple request wrapper would suffice.
In either case i think we might need a greater sample size of usage until we can try to simplify.
in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error that provides more detail in the format notification?
Right, that could work. It'd require quite a bit of reworking of the way things are currently set up, but I think it'd present a nicer API to the end-user.
As for retry behaviour, i'm unsure how to approach this as my applications have different requirements (some should always retry, some can retry a few times then decide not to continue). As long as these apps can have control over the behaviour then perhaps a simple request wrapper would suffice.
Indeed, this is the tricky bit. I imagine that one-shot requests like this that could be retried would have some additional arguments to configure this behaviour. Something like this maybe:
DhtClient.Neo.Get.RetryConfig retry;
retry.max_retries = 3;
dht_client.get("channel", key, retry);
Right, that could work. It'd require quite a bit of reworking of the way things are currently set up, but I think it'd present a nicer API to the end-user.
Though... it'd mean a union inside a union, which might not be that fun to deal with.
In either case i think we might need a greater sample size of usage until we can try to simplify.
I agree with this point. We don't have a large enough usage sample yet to be able to make a good judgement. I just wanted to get some ideas going, in advance.
How about having the taskblocking call return success
on success (as is now) and a SmartUnion
on failure, so that user can react on only those events they are interested in:
void[] get_buf;
auto result = this.dht.blocking.get("channel", 0x1234567812345678,
get_buf);
if ( result.succeeded )
{
if ( result.value.length )
// Received the requested record.
else
// Key not found in DHT.
}
else {
with (result.error) switch (result.error) {
case node_disconnected:
schedule_retry(); break;
}
}
That would be an improvement to the current API, yes. It wouldn't address the point about most apps (probably) ending up implementing the same error handling logic, though, which is kind of my main problem with the current approach.
Thoughts on implementing this:
Scott;
in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error that provides more detail in the format notification?
These cases are not quite the same. Some of those, such as unsupported
indicated a deployment versioning problem. I would want to terminate the app in that case, it can only be fixed by installing new software somewhere.
But some of the others might be caused by hardware problems in one of the nodes or in a router. In that case the ISP or infrastructure team may fix it, so that it will recover automatically, without needing to restart the app. But even so, I think your idea of a fatal_error is a good one.
Gavin:
Real use cases will inevitably use the first, more complex API.
I don't think that's true. For apps like shore, where the data is not very valuable, the task blocking API is perfect. In general I'm wary of something like a framework that tries to simplify the most common kinds of requests. It's very easy for that sort of thing to be not-quite-flexible-enough.
You might want to consider defining a generic default behaviour for retrying a request and error handling given the user the flexibility to implement their own ones for special/corner cases. From the user perspective this could help avoiding code duplication and also reduce the focus in the logic of handling a request and mainly deal with the business logic on application/client side instead.
You might want to consider defining a generic default behaviour for retrying a request and error handling given the user the flexibility to implement their own ones for special/corner cases. From the user perspective this could help avoiding code duplication and also reduce the focus in the logic of handling a request and mainly deal with the business logic on application/client side instead.
Yeah, I certainly wouldn't envisage removing the current "full info, full power" API.
I don't think that's true. For apps like shore, where the data is not very valuable, the task blocking API is perfect.
Interesting. I didn't expect anyone to use the blocking API without the notifiers in anything but simple tests. The danger of it is that error logging will be very unhelpful.
Don, maybe we're getting mixed up. There are three ways of assigning many requests:
3 isn't recommended for use in real apps, but 2 is.
Some of those, such as unsupported indicated a deployment versioning problem. I would want to terminate the app in that case, it can only be fixed by installing new software somewhere.
Theoretically the node could be reverted or updated that would resolved the unsupported request. An aggregation app continue running aggregating data in that case.
(This issue possibly doesn't apply to swarm specifically, but might have to be implemented independently in each *proto. Writing it here initially, just to put it in a central location.)
The current recommended client usage looks something like this (taking the DHT client's Get request as an example):
This is of the "precise knowledge gives great control" variety of API: it provides the user with full information about what happened, allowing them to decide what to do, but does not implement any default behaviour.
On the other hand, we have the simplified,
Task
-blocking API, which looks like this:This is the opposite extreme: it only provides the user with ultra minimal information about what happened, and still doesn't implement any default error handling behaviour.
It seems to me that:
Something like this, for DHT Get: