Closed: Alex-Cook4 closed this issue 4 years ago
The first two - DynamicWriteData and Y-Import - sound appropriate. They sound like operators designed to manage the flow of tuples in an application.
But RedisJson sounds like something that is not "plumbing", as it's about pushing a particular kind of data to a particular kind of external storage. That may be more appropriate in streamsx.json.
That makes sense @scotts. Do the following namespaces sound good for the first two?
`com.ibm.streamsx.plumbing.filters` and `com.ibm.streamsx.plumbing.imports`
@Alex-Cook4 Since we want to encourage use of Publish/Subscribe from the topology toolkit, maybe the Y-Import should be in that toolkit?
Could you expand a little on `DynamicWriteData`? Its name doesn't match the description you provided, so I'm not exactly sure what it does.
@Alex-Cook4 FYI - I also hacked up a pair of Redis operators using Jedis: one that wrote each tuple into Redis, and one that read tuples from Redis.
@ddebrunner that's fair :-) It is basically a dynamic filter that is turned on and off based on a trace-level argument. We are only using it to filter right before a FileSink, but it could definitely be generalized further. "DynamicFilter" might be a better name, although I don't want to infringe on a more official DynamicFilter down the road.
That's cool that you used Jedis... what was the reason to do that over using the DPS toolkit?
I was using the Compose Redis service on Bluemix which DPS does not support.
`TraceLevelFilter`? Assuming that the filtering is driven by the SPL trace level.
Yes, I like that. +1
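For illustration only, here is a minimal Python sketch of the behavior being named in this thread: a filter whose pass-through is gated by the current trace level. The class name, the `TRACE_LEVELS` table, and the API are all hypothetical, not part of any toolkit; the real operator would be written in SPL and driven by the SPL trace level.

```python
# Hypothetical sketch of a "trace-level filter": tuples pass through only
# when the current trace level is at or above a configured threshold.
TRACE_LEVELS = {"error": 0, "warn": 1, "info": 2, "debug": 3, "trace": 4}

class TraceLevelFilter:
    def __init__(self, enable_at="debug"):
        # Tuples flow only when the current level is at least `enable_at`.
        self.threshold = TRACE_LEVELS[enable_at]
        self.level = TRACE_LEVELS["error"]  # default: filtering is "off"

    def set_level(self, level):
        # Analogous to changing the SPL trace level at runtime.
        self.level = TRACE_LEVELS[level]

    def __call__(self, tuples):
        # Pass tuples downstream only when tracing is verbose enough.
        return list(tuples) if self.level >= self.threshold else []

f = TraceLevelFilter(enable_at="debug")
print(f([1, 2, 3]))   # level is "error" -> []
f.set_level("debug")
print(f([1, 2, 3]))   # [1, 2, 3]
```

The point of the design is that the filter sits right before a sink (e.g. a FileSink) and is toggled without redeploying the application.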
If `RedisJson` uses DPS, then is it specific to Redis, or could it be used with any key-value store that DPS supports?
Sounds like a candidate for the DPS toolkit though.
@ddebrunner can you elaborate more on why you think the Y-Import would go in the topology toolkit? Isn't the focus of that toolkit Streams in other languages?
The topology toolkit also provides the publish-subscribe model, which is the easier-to-use approach to Import/Export. It seems that the optional import from a file should build upon Publish/Subscribe.
Though having thought about this, why do it that way at all (Y-Import)? Why not just Import/Subscribe, and have a microservice application that reads from a file and Exports/Publishes that stream, so that the application path being tested is the one that will be used in production?
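As a rough sketch of the pattern being proposed, assuming an in-memory stand-in for the topology toolkit's Publish/Subscribe (the `Broker` class and topic name below are purely illustrative): the production application only subscribes to a topic, while a separate test microservice reads records and publishes them on that same topic, so the tested path is the production path.

```python
# Minimal in-memory pub/sub broker standing in for Publish/Subscribe.
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # The production app registers interest in a topic.
        self.subscribers[topic].append(callback)

    def publish(self, topic, tup):
        # Deliver the tuple to every subscriber of the topic.
        for cb in self.subscribers[topic]:
            cb(tup)

broker = Broker()
received = []

# Production application: only subscribes; it contains no file-reading logic.
broker.subscribe("sensor.readings", received.append)

# Test microservice: reads records (a list standing in for a file)
# and publishes them on the same topic the production app uses.
for line in ["r1", "r2", "r3"]:
    broker.publish("sensor.readings", line)

print(received)  # ['r1', 'r2', 'r3']
```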
That's a good point. The main reason comes from trying to develop a "standard import" at this customer, with the key focus being on making sure novice Streams developers import and run their data through a threaded port that drops tuples, to prevent a backlog in upstream jobs. We have also run into the problem where developers change their code in order to test.
Packaging it all into one piece that can be tested using a submission-time parameter serves our purpose from a simplicity perspective when trying to push adoption. I personally prefer the idea of the testing microservice and will think about it more for our customer situation.
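The drop-on-backlog behavior mentioned above can be sketched in a few lines of Python, assuming a bounded buffer between producer and consumer (this is only an analogue of SPL's threaded port with a drop congestion policy, not the actual implementation):

```python
# Sketch: a bounded buffer between upstream and downstream. When the
# consumer falls behind and the buffer fills, new tuples are dropped
# instead of blocking (and thus backing up) the upstream job.
import queue

buf = queue.Queue(maxsize=2)   # deliberately small buffer to force drops
dropped = 0

for tup in range(5):           # fast producer submits 5 tuples
    try:
        buf.put_nowait(tup)
    except queue.Full:
        dropped += 1           # congestion policy: drop the tuple

consumed = []
while not buf.empty():         # slow consumer finally drains the buffer
    consumed.append(buf.get_nowait())

print(consumed, dropped)  # [0, 1] 3
```

The trade-off is explicit: upstream jobs never stall, at the cost of losing tuples under load.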
We have tried both approaches - (a) having a single Y composite operator; (b) having a microservice that reads from a file and exports which gets subsequently imported by the application.
In fact, (b) would be the preferred approach, since the production application does not have an unnecessary file source, directory scan, etc. For the senior developers on the team, (b) worked perfectly well, but we had to abandon it due to the proliferation of dependencies/applications, which some developers just could not handle, resulting in unnecessary questions and headaches for the senior developers.
Suppose we have 30 applications running: approach (b) requires 30 of these testing services/applications, which have tuple-type dependencies across applications (the file-source application and the rest of the application). Unless the packaging and naming of these pairs of applications is consistent and their location in the source repository is well understood, it tends to become a big mess.
Note that even if there were some way of passing the tuple type, not all applications could share a common test application, as a test application may include logic to clean data. Having this Y composite gives people a good head start, as they can include data-cleansing logic in the Y composite.
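To make the Y-composite idea concrete, here is a hedged Python sketch: one parameter (playing the role of a submission-time parameter) selects between a test file source and the live import, with shared cleansing logic applied either way. All function names (`file_source`, `live_import`, `clean`, `y_import`) are illustrative, not from any toolkit.

```python
def file_source():
    # Stand-in for a FileSource reading test data (note the messy whitespace).
    return ["  a ", "b", "  c"]

def live_import():
    # Stand-in for Import/Subscribe from the production feed.
    return ["x", "y"]

def clean(tup):
    # Shared data-cleansing step, applied regardless of the source.
    return tup.strip()

def y_import(use_test_file):
    # `use_test_file` plays the role of a submission-time parameter:
    # both branches feed the same downstream (cleansing) logic, which is
    # the "Y" shape that gives the composite its name.
    source = file_source() if use_test_file else live_import()
    return [clean(t) for t in source]

print(y_import(use_test_file=True))   # ['a', 'b', 'c']
print(y_import(use_test_file=False))  # ['x', 'y']
```

This shows why a single composite avoids the pairing problem above: there is no second application to package, name, and keep type-compatible.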
@ddebrunner Would you lean towards the Y-Import being a part of the streamsx.topology or here in the plumbing?
If `Y-Import` was really `Y-Subscribe`, then it should be in topology. If it's going to expose low-level `Import` primitives, then it probably makes sense in plumbing, though I'm concerned we will end up with two competing high-level schemes based around dynamic connections, potentially confusing developers about which approach to take.
I could probably provide better input if there were a more complete description (e.g. is it in a form yet where SPLDOC can be generated?). @dakshiagrawal adds that cleansing functionality can be included in it, but I'm not sure how that will be done. It seems like the composite is trying to solve several problems (back-pressure, testing, data cleansing, standard import, throttling), so it potentially crosses several toolkits (e.g. should it be part of a testing toolkit once that gets off the ground?), as well as overlapping with some issues the product is addressing differently.
@ddebrunner, have you put your Jedis Redis operators anywhere on GitHub? I'm curious to take a look.
I have several composites that we have found very useful across a large deployment at one of our large customers. Would this be the right place for them? If not, any idea where? They are: