Open · anthonyfok opened 3 years ago
(Times quoted in brackets are rough compression times with `xz -9` on a gen-3 Intel Core i5.)

The SHA-256 checksum that xz embeds (when compressing with `--check=sha256`) can be extracted with:

```
xz -lvv <xz-file> | grep -Eo '[0-9a-z]{64}'
```
`OpenDRR/opendrr-api/python/add_data.sh` currently fetches some historic CSV files that may have already been deleted in `HEAD`. Running

```
grep -B1 '?ref' opendrr-api/python/add_data.sh
```

gives a list of them:
```
fetch_csv model-inputs \
    exposure/census-ref-sauid/census-attributes-2016.csv?ref=ab1b2d58dcea80a960c079ad2aff337bc22487c5
--
fetch_csv model-inputs \
    exposure/general-building-stock/documentation/collapse_probability.csv?ref=73d15ca7e48291ee98d8a8dd7fb49ae30548f34e
--
fetch_csv model-inputs \
    exposure/general-building-stock/documentation/retrofit_costs.csv?ref=73d15ca7e48291ee98d8a8dd7fb49ae30548f34e
--
fetch_csv model-inputs \
    natural-hazards/mh-intensity-ghsl.csv?ref=ab1b2d58dcea80a960c079ad2aff337bc22487c5
```
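For illustration, one of these pinned files can be fetched straight from raw.githubusercontent.com — a sketch only, assuming the files live in an `OpenDRR/model-inputs` repo (as the `fetch_csv model-inputs` calls suggest) and that GitHub redirects LFS-tracked files to media.githubusercontent.com, hence `-L`:

```bash
# Sketch only: fetch one pinned CSV directly via raw.githubusercontent.com;
# -L follows the media.githubusercontent.com redirect for LFS-tracked files.
curl -L -o census-attributes-2016.csv \
  "https://raw.githubusercontent.com/OpenDRR/model-inputs/ab1b2d58dcea80a960c079ad2aff337bc22487c5/exposure/census-ref-sauid/census-attributes-2016.csv"
```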
Tasks

- [ ] ~Modify `fetch_csv` function~ Add new `fetch_csv_xz` function in OpenDRR/opendrr-api/python/add_data.sh to download from these compressed repos
- [ ] Rename existing `fetch_csv` function as `fetch_csv_lfs`
- [ ] Rewrite `fetch_csv` function to call `fetch_csv_xz` and fall back to `fetch_csv_lfs` (see the sketch below)
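A minimal sketch of how these three tasks could fit together — the `fetch_csv_xz` body, the `model-inputs-xz` mirror name, its branch, and the URL scheme are all assumptions, not the final implementation:

```bash
# Hypothetical: download <path>.xz from a compressed mirror repo
# (e.g. model-inputs-xz) via raw.githubusercontent.com, then decompress.
fetch_csv_xz() {
  local repo="$1"
  local path="${2%%\?*}"              # drop any ?ref=... suffix
  local file="$(basename "$path")"
  curl -fL -o "${file}.xz" \
    "https://raw.githubusercontent.com/OpenDRR/${repo}-xz/master/${path}.xz" &&
    xz -d "${file}.xz"
}

# Renamed copy of the current LFS-based fetch_csv (body omitted here).
fetch_csv_lfs() {
  echo "Falling back to Git LFS download of $2" >&2
  # ...existing fetch_csv logic...
}

# New fetch_csv: try the xz mirror first, fall back to Git LFS.
fetch_csv() {
  fetch_csv_xz "$@" || fetch_csv_lfs "$@"
}
```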
Description

Git LFS file download failure (Issue #90) might have been caused by our running out of our GitHub monthly bandwidth quota, especially with my frequent runs of `docker-compose up --build` and `docker-compose down -v` in recent days.

- Create compressed equivalents of LFS repos, e.g. model-inputs → ~model-inputs-gz or~ model-inputs-xz, etc. (2021-05-10 update: xz is chosen for its SHA-256 sum feature, which matches the `oid sha256` entries in Git LFS pointer files; see the checksum sketch after this list.)
- ~Or perhaps use our B2 or S3 bucket? (populate manually or using GitHub Actions)~
- ~Or can some kind of HTTP proxy be used? Any way to use B2 or S3 for such a proxy?~ 2021-05-10 update: Downloading directly from https://raw.githubusercontent.com/ seems fast enough, so the use of buckets might not be necessary.
- And what about a local cache?
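For instance, the match between xz's embedded checksum and the LFS pointer could be verified like this — a sketch; the file names are placeholders, and it assumes the `.xz` files are single-block archives created with `xz --check=sha256`:

```bash
# Compare the SHA-256 that xz stores in its index against the
# "oid sha256:..." line of the corresponding Git LFS pointer file.
xz_sum="$(xz -lvv census-attributes-2016.csv.xz | grep -Eo '[0-9a-z]{64}')"
lfs_sum="$(grep -Eo 'sha256:[0-9a-f]{64}' census-attributes-2016.csv.pointer | cut -d: -f2)"

if [ "$xz_sum" = "$lfs_sum" ]; then
  echo "Checksums match: $xz_sum"
else
  echo "MISMATCH: xz=$xz_sum lfs=$lfs_sum" >&2
fi
```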