Closed pnadolny13 closed 2 years ago
In GitLab by @DouweM on Oct 13, 2020, 13:57
@tomelliff Per https://github.com/singer-io/getting-started/blob/master/docs/CONFIG_AND_STATE.md#special-fields, "start_date
should be used on first sync to indicate how far back to grab records", which doesn't say anything about subsequent syncs, but I think it makes sense to consistently treat it as a lower bound on "how far back to grab records", even if the state suggests that we have in the past gone back earlier.
I'd gladly accept a contribution with this change!
In GitLab by @tomelliff on Oct 14, 2020, 07:36
mentioned in commit tomelliff/tap-gitlab@1a45e52283593208f35a90426265b947070a603b
In GitLab by @tomelliff on Oct 14, 2020, 07:37
mentioned in merge request !28
In GitLab by @DouweM on Oct 14, 2020, 11:15
assigned to @tomelliff
In GitLab by @DouweM on Oct 27, 2020, 14:38
mentioned in commit c0f729e8deeed0aade015139870cc800411d6280
In GitLab by @tomelliff on Oct 8, 2020, 21:21
I'd like to be able to use the state between runs to minimise how much data we pull back from Gitlab.
Unfortunately the heaviest part for us is tracking back through to the first
created
status job as this is non terminal. Ultimately this will nearly always track back to one of the first few pipelines for some of our projects because we havemanual
jobs that are blocking other jobs that are then left in thecreated
status.To get around this I'm dynamically setting the
start_date
by running the following command first:This then means we only track back a month on incremental runs when there is no state. However this doesn't work when there is state because it will use the state instead and only falls back to
start_date
when the state is empty for the entity.I'd like to propose the following change:
This feels potentially a little bit controversial to me for some reason but if someone states a
start_date
that is more recent than the state is tracking it is likely because they are deliberately trying to override the state being too old for an entity. I might be missing a use case here though.