This fixes #350 and some of #357. Masking isn't supported in this PR, which is already long enough. Most important changes:
Job gets new fields for parameters, steps and kwargs. Steps is a dict of steps such as 'export' and 'aggregate'. Each value is a dict of arguments for the step in question. Currently supported workflows are:
No steps; job completes after fetch & process.
export.
export, then aggregate.
kwargs is a reserved field for future expansion, to let jobs have keyword arguments that are job-wide instead of specific to a step.
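As an illustration of the steps field described above, here is a minimal sketch of how a steps dict could be checked against the three supported workflows. The helper name validate_steps, the SUPPORTED_WORKFLOWS constant and the example argument "format" are made up for this sketch and are not the code in this PR.

```python
# Hypothetical helper: only the step names 'export' and 'aggregate'
# come from the PR description; everything else is assumed.
SUPPORTED_WORKFLOWS = [
    (),                       # no steps: job completes after fetch & process
    ("export",),              # export only
    ("export", "aggregate"),  # export, then aggregate
]


def validate_steps(steps):
    """Return the ordered workflow matching a steps dict, or raise ValueError."""
    if not isinstance(steps, dict):
        raise ValueError("steps must be a dict")
    for name, args in steps.items():
        # Each value is a dict of arguments for that step.
        if not isinstance(args, dict):
            raise ValueError(f"arguments for step {name!r} must be a dict")
    for workflow in SUPPORTED_WORKFLOWS:
        if set(workflow) == set(steps):
            return workflow
    raise ValueError(f"unsupported workflow: {sorted(steps)}")
```

With this sketch, {"export": {"format": "csv"}, "aggregate": {}} maps to the export-then-aggregate workflow, while a dict containing only 'aggregate' is rejected.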
Added meaningful validation for various fields in various tables; it's not comprehensive but it's a start. Note this meant replacing update_status calls.
Added a custom dict literal field type to isolate all the repr and eval calls to a single place, and use consistent validation errors that we can trap for later on.
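The idea behind the dict literal field can be sketched without any framework as a pair of conversion functions with one error type. This is an assumption-laden illustration, not the field in this PR: the names DictLiteralError, dict_to_text and text_to_dict are hypothetical, and the sketch swaps plain eval for ast.literal_eval so that only literals are accepted.

```python
import ast


class DictLiteralError(ValueError):
    """Single error type so callers can trap bad dict literals in one place."""


def dict_to_text(value):
    """Serialize a dict to its repr for storage in a text column."""
    if not isinstance(value, dict):
        raise DictLiteralError(f"expected a dict, got {type(value).__name__}")
    return repr(value)


def text_to_dict(text):
    """Parse stored text back into a dict, rejecting anything else."""
    try:
        # literal_eval is used here instead of eval: it only accepts literals.
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError) as exc:
        raise DictLiteralError(f"not a valid literal: {text!r}") from exc
    if not isinstance(value, dict):
        raise DictLiteralError(f"expected a dict literal, got {type(value).__name__}")
    return value
```

Keeping both directions in one place means every caller sees the same DictLiteralError, whether the stored text is malformed or simply not a dict.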
Configurable spatial chunking for PPJs via the PP_SPATIAL_CHUNK_SIZE setting.
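For readers unfamiliar with spatial chunking, a rough sketch follows. It assumes PP_SPATIAL_CHUNK_SIZE means the maximum number of rows and columns per chunk of a 2D grid; the actual meaning of the setting may differ, and the function name spatial_chunks is hypothetical.

```python
# Assumed meaning: maximum rows/columns per chunk. The real setting may differ.
PP_SPATIAL_CHUNK_SIZE = 2


def spatial_chunks(n_rows, n_cols, chunk_size=PP_SPATIAL_CHUNK_SIZE):
    """Yield half-open windows (row_start, row_stop, col_start, col_stop)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for row in range(0, n_rows, chunk_size):
        for col in range(0, n_cols, chunk_size):
            # Clamp the last window in each direction to the grid edge.
            yield (
                row,
                min(row + chunk_size, n_rows),
                col,
                min(col + chunk_size, n_cols),
            )
```

Under that assumption, a 3 x 2 grid with a chunk size of 2 yields two windows, one per PPJ.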
The DH works with SQLite now; the main change needed was replacing a Postgres-only distinct() call.
Changed PPJ submission and the worker function: removed the job PK from PPJ args, since it duplicates the job foreign key on PPJs. PPJ IDs are now sent to the worker task via queue.submit; chunked arguments are no longer sent because the PPJ arguments are already saved in the database. Exceptions are caught and the corresponding PPJs are failed.
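The worker-side pattern can be sketched roughly as below: the task receives only PPJ IDs, looks up the saved arguments, and marks a PPJ as failed if processing raises. An in-memory dict stands in for the database table, and the names PPJ_TABLE, process and run_ppjs, as well as every status string other than 'scheduled', are assumptions for the sketch.

```python
# In-memory stand-in for the PPJ table; the real code reads from the database.
PPJ_TABLE = {
    1: {"job_id": 10, "args": {"chunk": (0, 2, 0, 2)}, "status": "scheduled", "error": None},
    2: {"job_id": 10, "args": {"chunk": None}, "status": "scheduled", "error": None},
}


def process(args):
    """Placeholder for the real per-chunk work."""
    if args["chunk"] is None:
        raise ValueError("missing chunk")
    return args["chunk"]


def run_ppjs(ppj_ids):
    """Worker task body: takes PPJ IDs only, not the chunked arguments."""
    for ppj_id in ppj_ids:
        ppj = PPJ_TABLE[ppj_id]  # arguments are already saved with the PPJ
        ppj["status"] = "running"
        try:
            process(ppj["args"])
        except Exception as exc:
            # Fail only the PPJ that raised; keep working through the rest.
            ppj["status"] = "failed"
            ppj["error"] = str(exc)
        else:
            ppj["status"] = "completed"
```

Because the job is reachable through each PPJ record, the worker needs nothing beyond the list of IDs.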
Refactored the job completion check into its own function, since the PPJ scheduling function is already too long, and removed 'scheduled' from the task-failure check because it raced with tasks freshly submitted earlier in the same scheduler run.
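A rough sketch of what a standalone completion check might look like, with 'scheduled' left out of the failure check. The status names other than 'scheduled', the two constants and both function names are assumptions, not the code in this PR.

```python
TERMINAL_STATUSES = {"completed", "failed"}  # assumed status names
# 'scheduled' is intentionally absent: a task submitted earlier in the same
# scheduler run may not be visible to the queue yet and would look failed.
FAILURE_CHECK_STATUSES = {"running"}


def ppjs_to_check_for_failure(ppj_statuses):
    """Return IDs of PPJs whose queue task should be inspected for failure."""
    return sorted(
        ppj_id
        for ppj_id, status in ppj_statuses.items()
        if status in FAILURE_CHECK_STATUSES
    )


def job_is_complete(ppj_statuses):
    """A job is complete once it has PPJs and all of them are terminal."""
    return bool(ppj_statuses) and all(
        status in TERMINAL_STATUSES for status in ppj_statuses.values()
    )
```

Here ppj_statuses is a plain mapping of PPJ ID to status; a freshly scheduled PPJ keeps the job incomplete but is never treated as a failure candidate.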