mrchrisadams / hourly-carbon-intensity-usa

Scripts to make an sqlite database of hourly carbon intensity of every balancing authority in the USA
Apache License 2.0

Make the GitHub Action to republish this data as a daily job #1

Open mrchrisadams opened 1 year ago

mrchrisadams commented 1 year ago

We now have a script that generates an SQLite file, which you can easily browse using Datasette, as well as a Parquet file of all the readings.
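
As a rough sketch of how the two outputs can be inspected once the script has run (the Parquet path matches the workflow below; the SQLite filename is an assumption, so adjust it to whatever the script actually writes):

import sqlite3

import pandas as pd  # pd.read_parquet needs pyarrow or fastparquet installed

# Load the Parquet file of hourly readings.
readings = pd.read_parquet("output/hourly-co2-usa.ztd.parquet")
print(readings.head())

# List the tables in the generated SQLite file.
con = sqlite3.connect("output/hourly-co2-usa.db")
print(con.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall())
con.close()

Datasette can then be pointed at the same SQLite file with datasette output/hourly-co2-usa.db.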

Running this daily would be very helpful.

You can upload the data as a workflow artifact from the GitHub Action run until we've figured out where to put it on a more permanent basis.

https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts

name: Make db and parquet

on:
  schedule:
    - cron: "0 6 * * *" # run once a day; the time of day is arbitrary
  workflow_dispatch: # allow manual runs while we're still testing

jobs:
  build_and_upload:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: pip install and run python script
        run: |
          pip install -r requirements.in
          python ./fetch_data_from_balancing_authorities.py
      - name: Upload generated data file
        uses: actions/upload-artifact@v3
        with:
          name: hourly-co2-usa.ztd.parquet
          path: output/hourly-co2-usa.ztd.parquet

Until we know where we want to make the data available to download from, it's probably best to have the GitHub Action upload just the generated Parquet file: with 83 balancing authorities, I'm guessing that the generated SQLite database file would be between 500 and 800 GB uncompressed.
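
If it helps with that decision, here is a quick sketch for checking the actual on-disk size of each output after a run (filenames are assumptions, matching the paths used above):

from pathlib import Path

# Report the on-disk size of each generated file, in megabytes.
for candidate in ("output/hourly-co2-usa.ztd.parquet", "output/hourly-co2-usa.db"):
    path = Path(candidate)
    if path.exists():
        print(f"{path}: {path.stat().st_size / 1_000_000:,.1f} MB")
    else:
        print(f"{path}: not found")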