It turns out the cronjob may have been invoked only once.
But the metadata S3 trigger was called twice:
The interval is about 44 seconds (23:36:13 to 23:36:57), which is quite a while. What fired that metadata trigger? We should be able to see which metadata.json triggered each invocation by taking a closer look at their CloudWatch logs:
23:36:13 first 09429cf6-8fc4-47b9-9cef-e3681a55434c, triggered by daily-headlines/2021-08-20T22:34:47Z/metadata.json
23:36:57 second b16ec1d8-0868-4adc-9f63-63489ad36d69, triggered by daily-headlines/2021-08-20T23:13:25Z/metadata.json
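To make the "which metadata.json triggered this invocation" lookup easier next time, the trigger can log the object key next to its request ID. The following is only a minimal sketch, assuming the trigger is a Python Lambda subscribed to S3 object-created notifications. The names `handler` and `extract_trigger_keys` are hypothetical and not taken from the project; the event fields are those of the standard S3 notification payload.

```python
import urllib.parse


def extract_trigger_keys(event):
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (':' becomes '%3A').
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs


def handler(event, context):
    # Hypothetical handler; aws_request_id is the per-invocation ID that
    # Lambda also writes to the CloudWatch log stream.
    request_id = getattr(context, "aws_request_id", "unknown")
    pairs = extract_trigger_keys(event)
    for bucket, key in pairs:
        print(f"{request_id} triggered by s3://{bucket}/{key}")
    return pairs
```

With a line like this in every invocation's log, matching a request ID to its metadata.json becomes a single search in CloudWatch.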
Recall that our cronjob logged "Saved landing page metadata to s3://**/daily-headlines/2021-08-20T22:34:47Z/metadata.json", so the first one is no problem.
The question is why the second one got triggered. We can look at its events to figure out why:
**/daily-headlines/2021-08-20T23:13:25Z/landing.html, uuid 18acaf47-bba9-4468-9486-0adaa35ca457.
LANDING_METADATA_DONE was emitted at 2022-09-30T23:36:55Z. This is the cronjob outcome.
Cronjob log: Test invoke 7c2bce03-2c84-4126-8746-04938ddf103f done at --:36:11 -> generated metadata.json -> metadata trigger at 23:36:13. Makes sense.
At --:36:46 the cronjob was invoked again, as 9eaf1695-aeb2-4338-8871-43391530929f. Why? This is not our manual Test invoke. The only explanation is that our cronjob's rate(40min) schedule came due and happened to fire here.
LANDING_METADATA_REQUESTED was emitted at 2022-09-30T22:44:52Z, which always happens at the same time as LANDING_PAGE_FETCHED. So it only means the landing page trigger was invoked, i.e. the landing page was fetched. That is irrelevant here, so we can exclude it.
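The timeline is easier to check with the gaps computed explicitly. The sketch below uses only the minute:second values quoted in these notes and assumes all five events fall within the same hour; that is an assumption, since the hour is elided in the cronjob log. The event labels are informal descriptions, not log fields.

```python
from datetime import datetime

# (label, MM:SS) pairs copied from the notes; the hour is assumed to be shared.
EVENTS = [
    ("Test invoke of cronjob done", "36:11"),
    ("first metadata trigger", "36:13"),
    ("cronjob invoked again", "36:46"),
    ("LANDING_METADATA_DONE", "36:55"),
    ("second metadata trigger", "36:57"),
]


def seconds_between(start, end):
    """Seconds from one MM:SS timestamp to another within the same hour."""
    fmt = "%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Gap between each pair of consecutive events.
gaps = [
    seconds_between(EVENTS[i][1], EVENTS[i + 1][1])
    for i in range(len(EVENTS) - 1)
]

for (label, stamp), gap in zip(EVENTS[1:], gaps):
    print(f"{stamp} {label} (+{gap}s)")
```

Under that assumption, each metadata trigger fires two seconds after its cronjob run writes metadata.json, which is consistent with the second trigger coming from the scheduled run rather than from the manual Test invoke.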