After a few days that each cost about $20 in data egress, we've decided to limit our exposure to unexpected egress fees by turning on Requester Pays for the storage buckets containing the pudl-catalog.
[x] Enable requester pays on gs://intake.catalyst.coop
[x] Make the tests supply a billing project so they can access the storage buckets.
[x] Check that we're only downloading a minimal amount of data (a couple of state-years) in the tests.
[x] Update the example notebook to use requester_pays and new dd.read_parquet() args.
[x] Provide documentation / links for setting up a user billing project in the README.
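
For reviewers who haven't used Requester Pays before, here is a rough sketch of what reading from the bucket might look like once this is enabled. It assumes the data is read through `gcsfs` via `storage_options`; the helper names and the billing project ID are hypothetical, and the exact arguments used in the notebook may differ.

```python
def build_storage_options(billing_project: str) -> dict:
    """Build gcsfs options that bill requests to the caller's own project.

    `billing_project` is a placeholder for the reader's GCP project ID.
    """
    if not billing_project:
        raise ValueError("A billing project is required for Requester Pays buckets.")
    # gcsfs sends `project` as the userProject when requester_pays is True.
    return {"project": billing_project, "requester_pays": True}


def read_requester_pays_parquet(path: str, billing_project: str):
    """Lazily read a Parquet dataset from a Requester Pays bucket with dask."""
    # Imported here so build_storage_options() works without dask installed.
    import dask.dataframe as dd

    return dd.read_parquet(
        path,
        storage_options=build_storage_options(billing_project),
    )


# Hypothetical usage; the path below the bucket is left for the reader to supply:
# df = read_requester_pays_parquet("gs://intake.catalyst.coop/<path>", "my-billing-project")
```

Keeping the options in one small helper means the tests and the notebook can share the same billing-project setup.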