IBMStreams / streamsx.dps

The IBMStreams/streamsx.dps GitHub repository is home to the Streams toolkit named DPS (Distributed Process Store)
http://ibmstreams.github.io/streamsx.dps
Other
4 stars 9 forks source link

Composites for DPS putTTL and getTTL #50

Closed Alex-Cook4 closed 7 years ago

Alex-Cook4 commented 7 years ago

I have developed some SPL composites that wrapper the DPS functions into a DpsPutTTL Composite and a DpsGetTTL Composite for the direct needs we have at customer engagement. The main additions here being extensive metrics and a reconnection policy. I propose that these become part of the DPS toolkit in a com.ibm.streamsx/dps/util or com.ibm.streamsx/dps/wrappers namespace. I am attaching a screenshot of the SPLDOC for each operator.

For the composite names, I propose:

  1. DpsPutTTL
  2. DpsGetTTL

The danger of these names would be if there were future composites/operators that could fall under the same category. Do people think this is likely? It seems to me that adding the extra functionality (metrics and reconnection policy) wouldn't be consistent with other toolkits where those are just considered minor features.

dpscomposites

ddebrunner commented 7 years ago

Are these new operators specific to Redis, or could they work with any key-value store? Just I thought the toolkit was able to support more than just Redis.

ddebrunner commented 7 years ago

Why not use the same namespace as the function dpsGetTTL. The primary purpose of the operator is the same as dpsGetTTL so it seems com.ibm.streamsx.dps.distributed is the logical place.

If it's specific to redis then maybe the namespace should include redis.

Alex-Cook4 commented 7 years ago

I have only used it for Redis, but it should work fine with any DPS supported platform since I'm just using generic dps commands.

That would make sense to have the composites with the functions.

brandtol commented 7 years ago

The composites will work with all DB layers, but the semantics will be different. For example in the Cloudant layer the isConnected() call always returns true (no matter what/if there is any connection state). I think for the time being that is fine, and we can add functionallity to other layers if the need comes up. Maybe we should document it, and/or write a trace message.

brandtol commented 7 years ago

There is already an inconsistency in dps toolkit regarding the namespaces, at least as far as I see (or maybe I dont understand the reason for this):

The dpsGetTTL function lives in com.ibm.streamsx.store.distributed namespace, note that "dps" is missing. We probably cannot change that as applications use it. But for new functions I would suggest we use something that includes the toolkit name, like com.ibm.streamsx.dps.util.XXX or similar.

Otherwise I would agree that the same namespace as dpsGetTTL would be fine.

ZollnaPa commented 7 years ago

I suggest the SPL wording: Source & Sink like: DpsTTLSource & DpsTTLSink

nysenthil commented 7 years ago

Put and Get are commonly used terms associated with many NoSQL databases. In addition, DPS toolkit allows many other actions targeted at the configured NoSQL database beyond just Put and Get. It is better to leave Put and Get as it is.

ddebrunner commented 7 years ago

I also think the DpsGetTTL is not a source operator, but an enrich operator.

Alex-Cook4 commented 7 years ago

There are a bunch of varying opinions, but let me try a fresh proposal:

I propose that the composites have the same name as the functions. DpsGetTTL composite -> dpsGetTTL function DpsPutTTL composite -> dpsPutTTL function There is nothing special about these composites by adding metrics and reconnection, so I don't see a reason for adding that functionality to the name.

As for namespace, to be consistent I propose that we have the composites in the same namespace as the functions, which is com.ibm.streamsx.store.distributed.

@ddebrunner is that the namespace you meant with this comment?

The primary purpose of the operator is the same as dpsGetTTL so it seems com.ibm.streamsx.dps.distributed is the logical place.

@brandtol I also find it a bit confusing that the namespace doesn't include the full toolkit name, but I worry that it's also a little confusing to add another unique namespace.

I'm open to further conversation and changes to my proposal, this is more to try to get a decision finalized.

ddebrunner commented 7 years ago

@Alex-Cook4 Yes - I'd assumed the namespace used 'dps'.

brandtol commented 7 years ago

Lets try a vote, I see 3 options :

  1. com.ibm.streamsx.store.distributed.DpsGetTTLWithReconnect
  2. com.ibm.streamsx.store.distributed.DpsGetTTL
  3. com.ibm.streamsx.dps.util.DpsGetTTL

1)+ 2) have the advantage to not introduce a new namespace. 2) has a the problem IMHO that it only differs by one character from the normal native function. All options seem not optimal, but I would go with 1) 3) 2) in that order...

natashadsilva commented 7 years ago

@Alex-Cook4 I don't see the time to live as a parameter - is that because the time is expected as a parameter on the input tuple?

nysenthil commented 7 years ago

My preference is option 1: com.ibm.streamsx.store.distributed.DpsGetTTLWithReconnect

Alex-Cook4 commented 7 years ago

@natashadsilva Yes, it is expected as an attribute on the input tuple...which leads to the point that I should allow the use to specify the incoming ttl attribute.

Alex-Cook4 commented 7 years ago

I vote option 2. My reason being that I don't think composites and functions clash. They're quite separated in the documentation and within code used in very different scenarios. Googling for them will lead to a clash.

I don't feel very strongly about my option 2 preference, so will move forward with whichever option has the most votes on Monday.

Alex-Cook4 commented 7 years ago

I have opened pull request #59 to contribute the composites to the development branch of the toolkit. I will close this issue once we have integrated.

Alex-Cook4 commented 7 years ago

Completed. Thank you for the input everyone.