Fixed errors preventing some releases from building (340, 3130)
Upgraded the state machine pipeline with an option to reuse build artifacts from previous executions when they are available (use_existing_build=true)
Refactored the validation step to run inside the build stage, one release at a time, instead of running after the build across multiple releases
Added option to skip the load process, in case only the build artifacts are needed (skip_load=true)
Improved error handling during build:
An exit code of 1 indicates a critical failure and causes the build to fail
An exit code of 2 indicates a non-critical failure, for example when some alleles fail to build. The build can still succeed.
Errors during build are output to <data_bucket>/data/<release>/errors/errors.ndjson for later analysis
Failed Alleles queue is removed for now as it doesn't support debugging as well as the error output
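The exit code convention above could be consumed by a wrapper script along these lines. This is only a sketch: classify_build_exit is a hypothetical helper name, and treating any other code as "unknown" is an assumption, since only codes 1 and 2 are described here.

```shell
# Hypothetical helper: map a build exit code to a status label.
# Codes 1 and 2 follow the convention described above; 0 is assumed
# to mean success and anything else is reported as "unknown".
classify_build_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "critical" ;;      # build must fail
    2) echo "non-critical" ;;  # e.g. some alleles failed; build can still succeed
    *) echo "unknown" ;;
  esac
}

# Usage sketch (run_build stands in for whatever command performs the build):
#   run_build
#   status="$(classify_build_exit "$?")"
#   [ "$status" = "critical" ] && exit 1
```

With this shape, a non-critical result lets the caller continue and inspect errors.ndjson afterwards, while a critical result stops the run.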
Usage
use_existing_build=true will look for existing CSV files and load these. If there are no CSVs for the release then they will be created.
skip_load=true will run only the build stage and will skip loading. This is useful when just the CSVs are needed.
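The use_existing_build decision described above can be pictured roughly as follows. This is an illustrative sketch, not the pipeline's actual code: needs_build is a hypothetical name, and a local directory stands in for wherever the release CSVs are really stored.

```shell
# Hypothetical sketch: decide whether a release must be built.
# $1 = directory holding the release's CSVs (local stand-in for the real storage)
# $2 = value of use_existing_build
needs_build() {
  build_dir="$1"
  use_existing="$2"
  # Reuse only when the flag is set AND at least one CSV already exists.
  if [ "$use_existing" = "true" ] && ls "$build_dir"/*.csv >/dev/null 2>&1; then
    echo "reuse"
  else
    echo "build"
  fi
}
```

A release with no CSVs falls through to "build" even when use_existing_build=true, which matches the behaviour described above.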
# Example for single version
STAGE=<stage> make database.load.run releases="3510"
# Example for multiple versions where only 3510 has already been built
# 3490 and 3500 will be built, 3510 will use existing CSVs
STAGE=<stage> make database.load.run \
releases="3490,3500,3510" \
use_existing_build=true
# Example of how to build all releases and skip loading
STAGE=dev make database.load.run releases=300,310,320,330,340,350,360,370,380,390,3100,3110,3120,3130,3140,3150,3160,3170,3180,3190,3200,3210,3220,3230,3240,3250,3260,3270,3280,3290,3300,3310,3320,3330,3340,3350,3360,3370,3380,3390,3400,3410,3420,3430,3440,3450,3460,3470,3480,3490,3500,3510,3520,3530 skip_load=true
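The release list in the last example follows a regular pattern (300 to 390 and 3100 to 3530, in steps of 10), so it can be generated rather than typed by hand. A minimal sketch, assuming seq and paste are available; all_releases is a hypothetical helper name.

```shell
# Hypothetical helper: print the comma-separated release list used above.
all_releases() {
  { seq 300 10 390; seq 3100 10 3530; } | paste -sd, -
}

# Usage sketch:
#   STAGE=dev make database.load.run releases="$(all_releases)" skip_load=true
```

If new releases are added, only the upper bound of the second seq call needs to change.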
Next Steps