Two possible resolutions come to mind:
Refactor `handle_subscription` to submit jobs granule-by-granule instead of subscription-by-subscription, something like:

```python
def handle_subscription(subscription):
    for granule in get_unprocessed_granules(subscription):
        jobs = get_jobs_for_granule(subscription, granule)
        dynamo.jobs.put_jobs(subscription['user_id'], jobs, fail_when_over_quota=False)
        # TODO: exit the loop if the user has hit their quota?
```
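One way to address that TODO, sketched in the same style as the snippet above: this assumes (not confirmed against the current `dynamo.jobs` API) that `put_jobs` with `fail_when_over_quota=False` returns only the jobs it actually accepted, so a short return signals that the user's quota is exhausted:

```python
def handle_subscription(subscription):
    for granule in get_unprocessed_granules(subscription):
        jobs = get_jobs_for_granule(subscription, granule)
        # Assumption: put_jobs returns the subset of jobs it could submit within
        # the user's remaining quota when fail_when_over_quota=False.
        submitted = dynamo.jobs.put_jobs(subscription['user_id'], jobs, fail_when_over_quota=False)
        if len(submitted) < len(jobs):
            # Quota exhausted; later runs will pick up the remaining granules.
            break
```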
Refactor `get_jobs_for_subscription` to fetch at most, say, 20 granules per subscription per run to cap the amount of time spent processing any one subscription:

```python
def get_jobs_for_subscription(subscription, limit=20):
    granules = get_unprocessed_granules(subscription)
    jobs = []
    for granule in granules[:limit]:
        jobs.extend(get_jobs_for_granule(subscription, granule))
    return jobs
```
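For completeness, a rough sketch of what the call site might look like under this option; it assumes `handle_subscription` remains the function that submits the prepared jobs (the actual code in process_new_granules.py may differ):

```python
def handle_subscription(subscription):
    # Only the first 20 unprocessed granules become jobs on this run; subsequent
    # runs of the Lambda work through the rest of the backlog incrementally.
    jobs = get_jobs_for_subscription(subscription, limit=20)
    if jobs:
        dynamo.jobs.put_jobs(subscription['user_id'], jobs, fail_when_over_quota=False)
```

Each run then does a bounded amount of pair-finding work per subscription, so a large backlog is worked off over several invocations instead of stalling a single one.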
When evaluating subscriptions, the ProcessNewGranules lambda prepares all of the unprocessed jobs for each subscription before submitting any of those jobs:
https://github.com/ASFHyP3/hyp3/blob/develop/apps/process-new-granules/src/process_new_granules.py#L66-L69
This becomes problematic for new subscriptions, particularly InSAR subscriptions with search dates reaching far into the past, which may have many unprocessed granules. Preparing the list of unprocessed jobs involves finding InSAR pairs for each unprocessed granule, and that work can take longer than the Lambda function's 15-minute timeout. In those cases the execution is terminated before any new jobs are submitted, and subsequent runs of the function end up in the same state, typically requiring manual intervention before any new jobs will be submitted for any subscription.
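To make the failure mode concrete, here is a back-of-envelope calculation; the per-granule time and backlog size below are assumptions for illustration, not measurements:

```python
# Illustrative arithmetic only; these numbers are assumptions, not measurements.
SECONDS_PER_GRANULE = 5           # assumed time to find InSAR pairs for one granule
BACKLOG_GRANULES = 400            # assumed backlog for a subscription reaching far into the past
LAMBDA_TIMEOUT_SECONDS = 15 * 60  # 900 seconds

preparation_seconds = SECONDS_PER_GRANULE * BACKLOG_GRANULES  # 2000 seconds
print(preparation_seconds > LAMBDA_TIMEOUT_SECONDS)  # True: the run times out before put_jobs is reached
```

Because every retry starts over with the same full backlog, no run makes progress, which is why both proposed resolutions bound the amount of preparation done before the first submission.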