Closed rcavaliere closed 1 year ago
@clezag here is the paper with some explanations of the algorithms: Paper.pdf
An update on the situation: the current elaboration is not set up in a way that allows for actual real-time data.
The job runs every hour at exactly xx:00 and then recalculates all the time frames (10min, 30min etc.) of the last 24 hours. This takes a good while, so much in fact that sometimes the job takes more than 1 hour and skips its next run, leading to even more out-of-date data.
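To illustrate the skipped-run effect described above, here is a minimal sketch. It is not the actual scheduler code: the function name `actual_run_starts` and the rule that a slot is skipped while the previous run is still in progress are assumptions made for illustration.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)

def actual_run_starts(first_start, durations):
    """Simulate an hourly schedule where a slot is skipped if the
    previous run has not finished yet (assumed behaviour)."""
    starts = []
    slot = first_start
    for duration in durations:
        starts.append(slot)
        finish = slot + duration
        # Advance to the first hourly slot that is not before the finish time.
        next_slot = slot + HOUR
        while next_slot < finish:
            next_slot += HOUR
        slot = next_slot
    return starts

# Hypothetical run durations: the second run overruns the hour.
runs = actual_run_starts(
    datetime(2024, 1, 1, 8, 0),
    [timedelta(minutes=50), timedelta(minutes=70), timedelta(minutes=25)],
)
for start in runs:
    print(start.strftime("%H:%M"))
```

With these made-up durations the 10:00 slot is lost, so the results published by the 09:00 run are not refreshed until the 11:00 run has finished.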
We are currently testing some query performance optimizations that should bring elaboration time down to something more reasonable (~20 minutes).
While this will ameliorate the issue somewhat, it still doesn't produce real-time data, as it will still always be out of date by 20-60 minutes. To get actual real-time data, we would have to rework the elaboration in a major way. I would suggest discussing our options in person if we want to go that route, as there are many different levels of compromise we can choose from.
Next steps consolidated with @clezag and @dulvui:
@rcavaliere The index and query optimizations are now live in production
Seems like we have a never-ending story on our hands. Now that the performance fix is in production (and works: we're down from 1h+ to 25min), I've noticed that some stations are still very much out of date (like 2+h). It turns out that most of the time we don't get any data for a bluetooth box for quite a long time, and are then sent the history up to that point all at once.
If the job doesn't find any data for a period, it doesn't generate the measurement. For example if the last record is at 06:30, that node will be stuck on that time even though the elaboration ran without problems at 10:00.
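The behaviour described above can be sketched roughly as follows. This is an illustrative model only, not the real elaboration: `elaborate_windows`, the 10-minute window and the record timestamps are all hypothetical.

```python
from datetime import datetime, timedelta

def elaborate_windows(records, run_time,
                      window=timedelta(minutes=10),
                      lookback=timedelta(hours=24)):
    """Count records per time window; windows without records
    produce no measurement at all (assumed behaviour)."""
    counts = {}
    for ts in records:
        if not (run_time - lookback <= ts <= run_time):
            continue
        # Floor the timestamp to the start of its window.
        window_start = ts - (ts - datetime.min) % window
        counts[window_start] = counts.get(window_start, 0) + 1
    return counts

# Hypothetical station whose last record arrived at 06:30.
records = [
    datetime(2024, 1, 1, 6, 12),
    datetime(2024, 1, 1, 6, 25),
    datetime(2024, 1, 1, 6, 30),
]
run_time = datetime(2024, 1, 1, 10, 0)

measurements = elaborate_windows(records, run_time)
latest = max(measurements)
lag = run_time - latest
print(latest.strftime("%H:%M"), lag)
```

Even though the run at 10:00 succeeds, the newest measurement for this station stays at the 06:30 window, so it appears 3.5 hours out of date.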
Updates come in every 10 minutes, but they only include a few stations at a time (~10). I don't see any pattern as to which ones get updated more often or when. Maybe there is some issue at the data provider or with the network to the bluetooth stations?
@rcavaliere Since you know the project quite well, do you have a possible explanation from the data provider side?
I'd say if we don't get this sorted out first, there is little value in evaluating changes to the elaboration / scheduler.
@clezag I suggest closing this issue. We have already made a relevant improvement; further developments should be evaluated outside this user story. Thanks a lot for your work!
It is related to this elaboration: https://github.com/noi-techpark/bdp-elaborations/tree/main/bluetooth-traffic
The task to be done is similar to https://github.com/noi-techpark/bdp-commons/issues/568. At present the elaborations have a delay of hours, which makes this information unusable for real-time applications.