Closed smnorris closed 5 months ago
It seems possible to generate the matrix to process the tiles in parallel, but I cancelled the proof-of-concept test run because it was not progressing. Is it reasonable to make 89 concurrent requests to a single parquet file on object storage? Need to:
Still sluggish to get going, but it seems to work now that the files are properly formatted: https://github.com/bcgov/CE_integratedroads/actions/runs/8620376591/job/23627218010
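For reference, a minimal sketch of the matrix generation discussed above, written in Python. It is not the script used in the linked run: the tile ids, the `build_matrix` and `matrix_output_line` helpers, and the batch size are all hypothetical. The idea is to group tiles into batches so that fewer jobs hit the parquet file at once; the printed line could be appended to `$GITHUB_OUTPUT` and read in the workflow with `fromJSON`.

```python
import json


def build_matrix(tiles, batch_size=10):
    """Group tile ids into batches; each batch becomes one matrix job."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = [tiles[i:i + batch_size] for i in range(0, len(tiles), batch_size)]
    # One matrix entry per batch, with its tile ids joined into a single string
    return {
        "include": [
            {"batch": n, "tiles": ",".join(batch)} for n, batch in enumerate(batches)
        ]
    }


def matrix_output_line(tiles, batch_size=10):
    """Format the matrix as a single key=value line for a workflow step output."""
    matrix = build_matrix(tiles, batch_size)
    return "matrix=" + json.dumps(matrix, separators=(",", ":"))


if __name__ == "__main__":
    # Placeholder tile ids (assumption); the real list would come from the tile source
    tiles = [f"tile_{i:03d}" for i in range(89)]
    print(matrix_output_line(tiles))
```

With 89 tiles and a batch size of 10 this yields 9 jobs instead of 89. Whether that is enough to keep object storage responsive is untested here; setting `max-parallel` on the matrix strategy is another way to cap concurrency.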
Potential improvements:
Splitting the ce-integratedroads workflow into reusable workflows would be valuable (download; preprocess; final process). This is mostly for development/testing: there is no need to download/preprocess more than once unless something has changed.
Try and run the full job using GHA runners.
Requires:
location to serve source and output files (object store)
reworking scripts to minimize data held in the db - ideas: