_It is possible, although rare, for the recorded `start_tstamp` of a session to be later than the true `start_tstamp` of that session. This can happen when `load_tstamp` is not used as the `session_timestamp` and the first event of the session is loaded after a long delay (greater than the lookback window). In this situation, because we currently only filter `events_this_run` on the session ID and the minimum start overall (not per session), this late-loading event may be included on some runs and excluded on later runs._
_To make sure this event is deterministically excluded from `events_this_run`, I have added a filter so that the events for a session must have a timestamp greater than or equal to the start of that session as recorded in the lifecycle table._
_If the user uses `load_tstamp` for `session_timestamp`, this is not relevant and the issue would not occur anyway._
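To illustrate why the per-session filter makes the result deterministic, here is a minimal Python sketch of the logic. It is not the package's code (the real change lives in the SQL models), and every name in it (`filter_events_this_run`, `session_starts`, the sample events) is hypothetical and chosen only for this example.

```python
from datetime import datetime


def filter_events_this_run(events, session_starts, per_session=True):
    """Sketch of the events_this_run filter (hypothetical names throughout).

    events: list of dicts with "session_id" and "session_timestamp".
    session_starts: session_id -> start as recorded in the lifecycle table,
        restricted to the sessions being processed in this run.
    per_session: when False, only the overall minimum start is applied,
        which mimics the behaviour before this change.
    """
    lower_limit = min(session_starts.values())  # min start overall
    kept = []
    for event in events:
        start = session_starts.get(event["session_id"])
        if start is None:
            continue  # session is not part of this run
        ts = event["session_timestamp"]
        if ts < lower_limit:
            continue
        if per_session and ts < start:
            continue  # earlier than the recorded start of its own session
        kept.append(event)
    return kept


def ts(hour, minute):
    return datetime(2024, 1, 1, hour, minute)


# Assumed scenario: the 10:00 event of s1 loaded late, so the lifecycle
# table recorded 10:05 as the start of s1.
events = [
    {"id": "late", "session_id": "s1", "session_timestamp": ts(10, 0)},
    {"id": "e1", "session_id": "s1", "session_timestamp": ts(10, 5)},
    {"id": "e2", "session_id": "s2", "session_timestamp": ts(9, 0)},
]
run_a = {"s1": ts(10, 5), "s2": ts(9, 0)}  # s2 pulls the overall min to 09:00
run_b = {"s1": ts(10, 5)}                  # only s1 is processed


def ids(rows):
    return [row["id"] for row in rows]


old_a = ids(filter_events_this_run(events, run_a, per_session=False))
old_b = ids(filter_events_this_run(events, run_b, per_session=False))
new_a = ids(filter_events_this_run(events, run_a))
new_b = ids(filter_events_this_run(events, run_b))
```

With only the overall minimum, the late event is kept in the run that also contains `s2` and dropped in the run that does not. With the per-session check it is dropped in both, which is the deterministic behaviour described above.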
The above was the original plan for this PR. It now also:
- Fixes a bug in the tests where we used a hard-coded timestamp type rather than the one from the variable used elsewhere.
- Fixes a bug where the manifest table was always updated with the collector timestamp. I "fixed" this by using a variable as the argument. I don't like this: it should just be an argument, updated everywhere it's called. But if we do that, we have to do another web release, and this way we can avoid that.
Checklist
[ ] I have verified that these changes work locally
[ ] I have updated the README.md (if applicable)
[ ] I have added tests & descriptions to my models (and macros if applicable)