bithw1 closed this issue 5 years ago
Hello, and thanks for pointing this out. Thanks to it, I've just pushed a fix for an assertion error.
To answer your question: I've observed this behavior too, and it's the reason why I don't want to assert on a strict difference between processing times. IMO it comes from a race condition between stream ingestion and the start of query execution. By the time the query really starts executing, it already has 2 records to process, and Spark processes them as micro-batches about 2 seconds apart. If you run this code for much longer than 30 seconds and print processingTimes, you'll see that the difference is always between 4 and 5 seconds for all records except the first 2, like here:
processing times ListBuffer(3, 4, 4, 5, 4, 5, 5, 4, 5, 4, 5, 4, 5, 4, 5, 5, 4, 4, 5, 5, 4, 4, 4)
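To make the "don't assert on a strict difference" point concrete, here is a minimal sketch of a tolerant check. It assumes the recorded processing times are epoch milliseconds; the object name, the helper names and the sample timestamps are all hypothetical and are not taken from TriggerTest.scala.

```scala
object ProcessingTimeGaps {
  // Gaps, in whole seconds, between consecutive epoch-millisecond timestamps.
  def gapsInSeconds(timestampsMs: Seq[Long]): Seq[Long] =
    timestampsMs.zip(timestampsMs.drop(1)).map { case (prev, next) => (next - prev) / 1000 }

  // True when every gap after the first `warmUp` gaps lies within [min, max] seconds.
  def steadyGapsWithin(gaps: Seq[Long], warmUp: Int, min: Long, max: Long): Boolean =
    gaps.drop(warmUp).forall(gap => gap >= min && gap <= max)

  def main(args: Array[String]): Unit = {
    // Hypothetical timestamps for illustration only, not values from a real run.
    val timestamps = Seq(1000L, 3100L, 7200L, 12100L, 16900L, 21950L)
    val gaps = gapsInSeconds(timestamps)
    println(s"gaps = $gaps")
    // Skip the warm-up gap and accept a range instead of an exact trigger interval.
    assert(steadyGapsWithin(gaps, warmUp = 1, min = 4, max = 5))
  }
}
```

Skipping the first gap(s) and accepting a range keeps the assertion stable even when the first micro-batches run closer together than the trigger interval.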
Thanks @bartosz25 for the helpful answer, I understand now.
Hi, @bartosz25
There is something in TriggerTest.scala that I don't understand. I printed Container.processingTimes to the console.
Why is the second processing time, 1534650798638, only about 2 seconds larger than 1534650795528? I think the difference should be about 5 seconds.