Closed felipegutierrez closed 5 years ago
Not exactly. We need a workload that makes the stream application complete after 20 minutes, without any extra parameter. This can be done by creating an input dataset that, when streamed to the application, takes 20 minutes to process. Note that if you optimize the application code, this time may stay the same, decrease, or increase.
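As a rough sketch of how such a dataset could be sized: if the application's throughput has been measured beforehand and is assumed to stay roughly constant, the number of tuples needed is throughput times duration. The function name and the 1000 tuples/s figure below are hypothetical, not taken from the project.

```python
import math


def tuples_for_duration(throughput_per_second: float, duration_minutes: float) -> int:
    """Estimate how many tuples keep the application busy for the given time.

    Assumes a constant, previously measured throughput (an assumption,
    since code optimizations change it and the dataset must then be resized).
    """
    if throughput_per_second <= 0 or duration_minutes <= 0:
        raise ValueError("throughput and duration must be positive")
    return math.ceil(throughput_per_second * duration_minutes * 60)


if __name__ == "__main__":
    # Hypothetical measurement: 1000 tuples per second, 20 minute target.
    print(tuples_for_duration(1000, 20))
```

If the application is later optimized, the measured throughput changes and the same dataset will no longer take 20 minutes, which matches the caveat above.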
I am not sure about this. Stream applications run indefinitely by nature. The workload can be finite, but the stream application will keep listening for new data on the source. So I guess we need a finite workload, and we raise a flag when the workload finishes. Does that make sense?
My understanding is that stream applications do not necessarily run "infinitely" (as in "forever"), but rather that they process data as it arrives. Consider a sensor (source) that is not generating any data, perhaps because it is off or because there is nothing to detect. In that case the stream application is idle, i.e., not effectively running, because there is no data to process.
So yes, we need a finite dataset as the workload, because our experiments will be finite. The dataset should have an "END_OF_DATASET" tuple at its end, which signals the end of the streaming data to the application and lets it go idle.
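A minimal sketch of the end-of-dataset idea, independent of any particular streaming framework: the producer appends the marker after the last record, and the consumer stops counting work once it sees it. Only the "END_OF_DATASET" marker comes from the discussion; the function names are hypothetical.

```python
from typing import Callable, Iterable, Iterator

END_OF_DATASET = "END_OF_DATASET"  # marker name taken from the discussion


def finite_workload(records: Iterable[str]) -> Iterator[str]:
    """Yield every record, then the end-of-dataset marker."""
    yield from records
    yield END_OF_DATASET


def consume(stream: Iterable[str], process: Callable[[str], None]) -> int:
    """Process tuples until the marker arrives; return how many were processed."""
    processed = 0
    for item in stream:
        if item == END_OF_DATASET:
            # A real streaming job would go idle here rather than exit;
            # this sketch simply stops so the experiment can be timed.
            break
        process(item)
        processed += 1
    return processed


if __name__ == "__main__":
    seen = []
    count = consume(finite_workload(["a", "b", "c"]), seen.append)
    print(count, seen)
```

The point at which the marker is observed is also where the flag mentioned earlier in the thread could be raised to record the end of the experiment.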
I implemented what is described in this issue, taking the discussion into account to better understand the requirement. Run the script bash conf/launchApp.sh to see all the instructions for using the producers and consumers. It also shows examples of how to launch the applications.
We need to set a parameter when launching the stream application so that it executes for a specific given time. Let's say we want to execute it for 20 minutes. We just need to pass this argument when launching the stream application.
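One way this could look, sketched in Python rather than in the project's own code: a launch argument sets the duration, and the processing loop stops once the deadline has passed. The flag name `--duration-minutes` and the function names are hypothetical and are not options of conf/launchApp.sh.

```python
import argparse
import time
from typing import Callable, Iterable, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the hypothetical --duration-minutes launch argument."""
    parser = argparse.ArgumentParser(description="Time-boxed stream run (sketch)")
    parser.add_argument("--duration-minutes", type=float, default=20.0)
    return parser.parse_args(argv)


def run_for(
    stream: Iterable[str],
    process: Callable[[str], None],
    duration_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Process tuples until the deadline passes; return how many were processed."""
    deadline = clock() + duration_seconds
    processed = 0
    for item in stream:
        # The deadline is only checked when a tuple arrives, so a source
        # that blocks while idle would need its own timeout as well.
        if clock() >= deadline:
            break
        process(item)
        processed += 1
    return processed


if __name__ == "__main__":
    args = parse_args()
    demo_stream = (f"tuple-{i}" for i in range(5))
    print(run_for(demo_stream, print, args.duration_minutes * 60))
```

The clock is passed in as a parameter so the time limit can be exercised in tests without waiting 20 minutes.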