IBMStreams / administration

Umbrella project for the IBMStreams organization. This project will be used for the management of the individual projects within the IBMStreams organization.

IBM Project EventStore repository and toolkit #119

Closed dzilio closed 7 years ago

dzilio commented 7 years ago

## Proposal

I would like to propose that a new toolkit and repository be created to provide easy integration with IBM Project EventStore. IBM Project EventStore information can be found here: https://www.ibm.com/us-en/marketplace/project-eventstore

The toolkit will initially provide an "EventStoreSink" operator that inserts streamed data in batches into an IBM Project EventStore database. The operator can be used in consistent regions, and it creates the table automatically if the table specified in the operator does not exist in the database. Connection information can be defined either through operator parameters or in an app config, and the batch size is configurable through a parameter.
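To make the batching and configuration behaviour concrete, here is a minimal Python sketch. It is not the operator's code: `BatchingSink`, `resolve_connection`, the `connectionString` key, and the rule that an operator parameter takes precedence over the app config are all assumptions made for illustration, and the write callback stands in for whatever call actually sends a batch to the database.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def resolve_connection(params: Dict[str, str],
                       app_config: Optional[Dict[str, str]] = None) -> str:
    """Hypothetical lookup: an operator parameter wins, else the app config value."""
    if "connectionString" in params:
        return params["connectionString"]
    if app_config and "connectionString" in app_config:
        return app_config["connectionString"]
    raise ValueError("no connection information supplied")


class BatchingSink:
    """Buffers rows and hands them to write_batch once batch_size rows are pending."""

    def __init__(self, write_batch: Callable[[List[Row]], None],
                 batch_size: int = 1000) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._write_batch = write_batch
        self._batch_size = batch_size
        self._pending: List[Row] = []

    def submit(self, row: Row) -> None:
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Write any partial batch, e.g. at shutdown or at a checkpoint.
        if self._pending:
            batch, self._pending = self._pending, []
            self._write_batch(batch)


# Usage: collect the batches in a list instead of writing to a database.
batches: List[List[Row]] = []
sink = BatchingSink(batches.append, batch_size=3)
for i in range(7):
    sink.submit({"id": i})
sink.flush()
print([len(b) for b in batches])  # prints [3, 3, 1]
```

The explicit `flush` matters for the final partial batch; in a consistent region the real operator would presumably need to flush pending rows at each checkpoint as well, but that is a guess, not something stated here.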

## Naming

I propose the following names:
• Repository: streamsx.eventstore
• Toolkit: com.ibm.streamsx.eventstore

## Initial Contribution

The toolkit will initially contain one operator, which consumes stream data and inserts it into an IBM Project EventStore engine:
• EventStoreSink

ddebrunner commented 7 years ago

+1 on the project

-1 on the operator name. I'd prefer a name that reflects the operation being performed in terms of the event store, e.g. is it inserting records, appending records, or something else?

dzilio commented 7 years ago

The operator can do a number of things:
1) If the table does not exist, it creates the table. Then:
2) If the table is empty, it inserts new rows.
3) If the table has rows, it tries to insert new rows into the table.

It does not overwrite existing rows (i.e. it neither updates nor deletes rows).

ddebrunner commented 7 years ago

What's the difference between 2 and 3?

dzilio commented 7 years ago

There is no real difference. Only new rows are inserted. In EventStore, a user can define a primary key on a table (the Streams operator also has a parameter for defining it when the operator creates the table in case (1)). If a user inserts a row whose primary key already exists, the duplicate row is rejected. In both (2) and (3) above, "new" rows are those whose primary key does not already exist in the database. From the operator's point of view, rows are merely sent to the EventStore database.
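The create-if-missing and insert-only semantics described above can be modelled with a small in-memory stand-in. This is a sketch only: `InMemoryStore`, `ensure_table`, and `insert_batch` are hypothetical names, not the EventStore client API, and rejecting duplicates row by row (rather than failing the whole batch) is an assumption.

```python
class InMemoryStore:
    """Toy stand-in for a database with insert-only, primary-key-checked tables."""

    def __init__(self):
        # table name -> (primary key column, {key value: row})
        self.tables = {}

    def ensure_table(self, name, primary_key):
        """Case (1): create the table only if it does not exist yet."""
        if name not in self.tables:
            self.tables[name] = (primary_key, {})
            return True
        return False

    def insert_batch(self, name, rows):
        """Cases (2) and (3): insert new rows; never update or delete.

        Returns the rows rejected because their primary key already exists.
        """
        primary_key, data = self.tables[name]
        rejected = []
        for row in rows:
            key = row[primary_key]
            if key in data:
                rejected.append(row)  # duplicate key: existing row is kept
            else:
                data[key] = row
        return rejected


store = InMemoryStore()
created = store.ensure_table("readings", primary_key="id")
rejected_first = store.insert_batch("readings", [{"id": 1, "v": 10}, {"id": 2, "v": 20}])
rejected_second = store.insert_batch("readings", [{"id": 2, "v": 99}, {"id": 3, "v": 30}])
print(created, rejected_first, rejected_second)
```

The empty-table and non-empty-table cases go through the same `insert_batch` path, which matches the point that there is no real difference between (2) and (3).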

mikespicer commented 7 years ago

+1, assuming operator naming gets sorted out

dzilio commented 7 years ago

Should I rename the operator from EventStoreSink to EventStoreInsert? Or are people OK with EventStoreSink?

ddebrunner commented 7 years ago

@dzilio Probably a discussion item once the repo gets created.

dzilio commented 7 years ago

ok

petenicholls commented 7 years ago

Repos created. Opened the first issue regarding the name of the operator.