Closed by nysenthil 7 years ago
I'm surprised by the format of the second output port: a list of tuples. The comments talk about the operator potentially holding over 100,000 tuples, and that's just not going to work as a single output tuple, i.e. a list of 100,000 nested tuples. The size limit on the output tuple is likely to be exceeded.
From the sample I'd assumed it would send out all the tuples as individual tuples, followed by a window mark.
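To make the alternative concrete, here is a minimal Python sketch (not SPL) of the per-tuple output shape described above: each expired tuple is emitted on its own, followed by one window mark. The names `emit_expired` and `WINDOW_MARK` are hypothetical and only stand in for the operator's submit and punctuation mechanisms.

```python
from typing import Any, Dict, Iterable, Iterator, Union

# Hypothetical sentinel standing in for a window mark (punctuation).
WINDOW_MARK = object()

StreamTuple = Dict[str, Any]


def emit_expired(expired: Iterable[StreamTuple]) -> Iterator[Union[StreamTuple, object]]:
    """Yield each expired tuple individually, then a single window mark."""
    for tup in expired:
        yield tup
    # One mark tells downstream operators that this batch is complete.
    yield WINDOW_MARK


batch = [{"id": i} for i in range(3)]
out = list(emit_expired(batch))
```

With this shape no single output tuple grows with the number of expired tuples, so a batch of 100,000 costs 100,000 small tuples plus one mark rather than one very large tuple.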
Primary task of the ExpiredTupleDetector operator is to hold the tuples arriving on port 0 in memory for a configured amount of seconds and then evict them when that configured time expires.
The first sentence should be a short, crisp definition of the operator; this one seems verbose.
Have to say I'm still trying to figure out a crisp definition myself. It doesn't really detect anything; it seems to be a smarter version of the Delay operator. I can see that, with the use of deletes, it can be used to process unacknowledged items after expiration.
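The "smarter Delay" reading can be sketched in a few lines of Python. This is an illustration under assumptions, not the operator's implementation: the class `ExpiringStore`, its method names, and the choice to pass the clock in explicitly (assumed non-decreasing) are all hypothetical.

```python
from collections import OrderedDict
from typing import Any, Dict, Hashable, List, Tuple


class ExpiringStore:
    """Hold tuples for a fixed time; evict them unless deleted first."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # key -> (deadline, tuple), kept in insertion (oldest-first) order.
        self._items: "OrderedDict[Hashable, Tuple[float, Dict[str, Any]]]" = OrderedDict()

    def insert(self, key: Hashable, tup: Dict[str, Any], now: float) -> None:
        # Assumption: re-inserting a key replaces it and restarts its timer.
        self._items.pop(key, None)
        self._items[key] = (now + self.ttl, tup)

    def delete(self, key: Hashable) -> bool:
        """Remove a held tuple (e.g. it was acknowledged); True if found."""
        return self._items.pop(key, None) is not None

    def evict(self, now: float) -> List[Dict[str, Any]]:
        """Return, oldest first, every tuple whose deadline has passed."""
        expired = []
        while self._items:
            key = next(iter(self._items))
            deadline, tup = self._items[key]
            if deadline > now:
                break  # same TTL for all, so everything after is newer
            del self._items[key]
            expired.append(tup)
        return expired


store = ExpiringStore(ttl_seconds=10.0)
store.insert("a", {"id": "a"}, now=0.0)
store.insert("b", {"id": "b"}, now=1.0)
acked = store.delete("a")          # "a" was acknowledged in time
early = store.evict(now=10.5)      # "b" is not due until 11.0
late = store.evict(now=11.0)       # only the unacknowledged "b" expires
```

Here only the tuple that was never deleted comes out at expiry, which is the unacknowledged-items use case.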
Thinking about it more, I think separate ports might make more sense. At the least, the delete port is a control port, and I can see there being a feedback loop from the second output port to the delete port, e.g. analyze all existing tuples in the operator and delete some of them based upon some criteria, in line with use case 1.
For the input ports, I wonder whether it makes more sense to have four input port sets, each with a cardinality of 1, or to keep it as it is now. With separate ports, each port would clearly get its own description.
Simply send a tuple via this input stream with that tuple's first attribute carrying a value needed to identify the data tuple to be deleted from this operator's internal in-memory data structure.
How does this single attribute identify a data tuple? (Same question for the delete.)
The requirement for it to be the tuple's first attribute seems somewhat arbitrary; it might mean I have to add a Functor somewhere to reorder an existing schema.
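A small Python sketch of the concern, with hypothetical names throughout (`key_by_position`, `key_by_name`, and the `timestamp`/`orderId` schema are invented for illustration): taking the key from the first attribute ties it to schema order, whereas naming the key attribute does not.

```python
from typing import Any, Dict


def key_by_position(tup: Dict[str, Any]) -> Any:
    """Key is whatever attribute happens to come first in the schema."""
    return next(iter(tup.values()))


def key_by_name(tup: Dict[str, Any], key_attr: str) -> Any:
    """Key is looked up by attribute name, independent of schema order."""
    return tup[key_attr]


# An existing schema whose identifying attribute is not first.
order = {"timestamp": 1700000000, "orderId": "A-17"}

positional_key = key_by_position(order)       # picks the timestamp
named_key = key_by_name(order, "orderId")     # picks the intended key
```

With the positional rule the stream would need reordering upstream so that `orderId` comes first; a named key attribute would avoid that extra step.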