bcgov / nr-rfc-grib-copy

Demo of using github actions to copy / process raw data
Apache License 2.0

Reduce execution time #7

Closed franTarkenton closed 1 year ago

franTarkenton commented 1 year ago

Currently this script is configured to run synchronously. Configuring its processing steps to run in parallel will significantly reduce the time required to complete the process.

This is an important component in supporting freshet, as there is an operational desire to pull this data hourly rather than daily.

This is the first of two tickets. This one focuses on the conversion from synchronous to asynchronous processing.

Another ticket will tie in with the Datamart AMQP (Advanced Message Queuing Protocol), creating a listener and having that listener trigger the processing of this data immediately after the data becomes available.
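For illustration only, the listener idea might be sketched as below. This does not use a real AMQP client: a standard-library `queue.Queue` stands in for the message broker, and `process_data`, `listen`, and `run_demo` are hypothetical names that do not come from this repository. It only shows the shape of the pattern, where processing starts as soon as a "data available" message arrives instead of on a schedule.

```python
import queue
import threading

# Records what was processed, so the demo has something to return.
processed = []


def process_data(url):
    """Hypothetical placeholder for the download-and-process step."""
    processed.append(url)


def listen(inbox):
    # Block until a message arrives, then trigger processing immediately.
    while True:
        url = inbox.get()
        if url is None:  # sentinel value: stop listening
            break
        process_data(url)


def run_demo(urls):
    # Assumption: a plain in-process queue stands in for the AMQP broker.
    inbox = queue.Queue()
    processed.clear()
    listener = threading.Thread(target=listen, args=(inbox,))
    listener.start()
    for url in urls:
        inbox.put(url)  # simulates a "data available" notification
    inbox.put(None)
    listener.join()
    return list(processed)


if __name__ == "__main__":
    print(run_demo(["https://example.invalid/cmc/file_0.grib2"]))
```

A real listener would subscribe to the Datamart exchange through an AMQP client library, and the callback it registers would play the role of `listen` here.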

franTarkenton commented 1 year ago

https://github.com/bcgov/nr-rfc-grib-copy/pull/9

Modified to do the downloads and the grib2 data processing in parallel, speeding up the time required to download and process the CMC data.
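A minimal sketch of this pattern, not the code from the PR: downloads are submitted to a thread pool and each file is processed as soon as its own download finishes. `download_file`, `process_grib`, `run_parallel`, and the example URLs are hypothetical placeholders, and the `time.sleep` calls only simulate work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_file(url: str) -> str:
    """Hypothetical stand-in for an I/O-bound download; returns a local path."""
    time.sleep(0.1)  # simulate network latency
    return "/tmp/" + url.rsplit("/", 1)[-1]


def process_grib(path: str) -> str:
    """Hypothetical stand-in for the grib2 processing step."""
    time.sleep(0.05)  # simulate processing time
    return path + ".processed"


def download_and_process(url: str) -> str:
    # Chain the two steps so each file is processed right after it arrives.
    return process_grib(download_file(url))


def run_parallel(urls: list[str], max_workers: int = 8) -> list[str]:
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(download_and_process, url) for url in urls]
        for future in as_completed(futures):
            # result() re-raises any exception raised in the worker thread.
            results.append(future.result())
    return sorted(results)


if __name__ == "__main__":
    urls = [f"https://example.invalid/cmc/file_{i}.grib2" for i in range(8)]
    start = time.perf_counter()
    outputs = run_parallel(urls)
    print(f"{len(outputs)} files in {time.perf_counter() - start:.2f}s")
```

Threads suit the download step because it is I/O-bound. If the grib2 processing turns out to be CPU-bound, a `ProcessPoolExecutor` may be a better fit for that step.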

To be completed: configure the object store copy to run in parallel as well.

franTarkenton commented 1 year ago

Added async uploads to object storage. The end result, with all the changes, is that the entire process is about twice as fast. Results of various runs are here: https://github.com/bcgov/nr-rfc-grib-copy/actions
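As a rough sketch of what async uploads can look like (the actual implementation is in the PR linked below and may differ): a blocking upload call is run in worker threads via `asyncio`, with a semaphore limiting how many uploads run at once. `upload_blocking`, `upload_all`, the bucket name, and the file paths are hypothetical, and the sleep only simulates a transfer.

```python
import asyncio
import time


def upload_blocking(local_path: str, bucket: str) -> str:
    """Hypothetical stand-in for a blocking object-store upload call."""
    time.sleep(0.1)  # simulate network transfer
    return f"{bucket}/{local_path.lstrip('/')}"


async def upload_all(paths: list[str], bucket: str, limit: int = 4) -> list[str]:
    # Bound concurrency so the object store is not flooded with requests.
    semaphore = asyncio.Semaphore(limit)

    async def upload_one(path: str) -> str:
        async with semaphore:
            # Run the blocking call in a worker thread (Python 3.9+).
            return await asyncio.to_thread(upload_blocking, path, bucket)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(upload_one(p) for p in paths))


if __name__ == "__main__":
    files = [f"/data/file_{i}.grib2" for i in range(8)]
    keys = asyncio.run(upload_all(files, "example-bucket"))
    print(keys)
```

The `limit` parameter is an assumption added for illustration. A sensible value depends on the object store's rate limits and the runner's bandwidth.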

PR with changes: https://github.com/bcgov/nr-rfc-grib-copy/pull/10