acep-uaf / aetr-web-book-2024

Alaska Electricity Trends Report as a web book
https://acep-uaf.github.io/aetr-web-book-2024/
Creative Commons Attribution Share Alike 4.0 International

Data page or Zip file to make data downloads prominent #18

Closed jikaczmarski closed 2 months ago

jikaczmarski commented 3 months ago

We would like to see a page where one could access the data right away.

ianalexmac commented 3 months ago

Perhaps we could keep data in SQLite .db format, then run queries to build CSV downloads.
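A minimal sketch of that idea, assuming Python's standard `sqlite3` and `csv` modules (the thread does not say which language the project's scripts use). The function name `export_tables_to_csv` and both paths are hypothetical. It writes one CSV per table in the database.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite database to its own CSV file.

    Hypothetical helper: db_path and out_dir are placeholders, not paths
    taken from the repository.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all tables; skip SQLite's internal ones.
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        tables = [name for name in names if not name.startswith("sqlite_")]
        for table in tables:
            # Table names cannot be bound as parameters, so quote the identifier.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            header = [col[0] for col in cursor.description]
            csv_path = out_dir / f"{table}.csv"
            with open(csv_path, "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                writer.writerow(header)   # column names first
                writer.writerows(cursor)  # then stream the rows
            written.append(csv_path)
    finally:
        conn.close()
    return written
```

Calling `export_tables_to_csv("aetr.db", "downloads")` (both names made up here) would leave one `<table>.csv` per table in `downloads/`, which a data page could then link to.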

ianalexmac commented 3 months ago

@dayne do we point data downloads towards https://github.com/acep-uaf/ak-energy-statistics-2011_2021 ?

eldobbins commented 3 months ago

Converting this to an Epic because we need to think about how the data is organized as part of this. Scheduling a kick-off meeting for that discussion in early April.

ianalexmac commented 3 months ago

From our conversation today, @eldobbins, I started to explore ways to organize the data. We talked about three directories, which I've created in /data/.

The data/ directory has been built out with three subdirectories: raw, working, and final. Each of the three contains a markdown file with a brief description of what should be there.

At the moment, I have price tables and a few capacity tables in the database. The prices.qmd page now reads from the database. capacity.qmd could follow suit, but will need a little tweaking for derived tables and the like. @jikaczmarski, we should chat about this soon.

I'm pivoting to think about code to generate CSV files from the database and make download links. You can see a window into the database on the new data page (live, but not linked in the sidebar, so not quite public).

None of this is permanent, and I'm really looking forward to more talk about organization and workflow.

eldobbins commented 3 months ago

I like this general structure. Could you have subdirectories in raw/ for generation, price, capacity?
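For reference, the layout discussed so far (raw, working, and final, with the three proposed subdirectories under raw/) could be scaffolded roughly as below. This is only a sketch in Python: the function name `scaffold_data_dirs`, the `README.md` file name, and the placeholder text are assumptions, since the thread only says each directory holds a markdown description.

```python
from pathlib import Path

STAGES = ("raw", "working", "final")
RAW_TOPICS = ("generation", "price", "capacity")


def scaffold_data_dirs(root):
    """Create the stage directories, a README stub in each, and raw/ topic folders."""
    root = Path(root)
    for stage in STAGES:
        stage_dir = root / stage
        stage_dir.mkdir(parents=True, exist_ok=True)
        readme = stage_dir / "README.md"  # assumed file name
        if not readme.exists():
            # Placeholder text only; real descriptions are written by hand.
            readme.write_text(
                f"# {stage}\n\nDescribe what belongs in `{stage}/` here.\n",
                encoding="utf-8",
            )
    for topic in RAW_TOPICS:
        (root / "raw" / topic).mkdir(exist_ok=True)
    # Return the directory tree as sorted relative paths for a quick check.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )
```

The function is safe to re-run: existing directories are left alone and existing README files are not overwritten.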

ianalexmac commented 3 months ago

It turns out .db and .zip files are both binary formats, so they're not ideal to host in a repo. There was talk of hosting the db on Google Drive, but we may run into permissions issues? The script that builds the database from raw files needs to have write permissions, while the scripts that render the webpage should not have write permissions, correct?

It seems like a good idea to have an action watch the raw data directory and rebuild the database when changes are made. And if we're going to have a zip of all tables, we need that to rebuild upon changes to the database.
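As a local stand-in for that kind of action, the sketch below (Python, standard library only) fingerprints the raw directory, runs a rebuild callback only when the fingerprint changes, and bundles the CSVs into a zip. All names here (`fingerprint`, `rebuild_if_changed`, `zip_csvs`, the stamp file) are hypothetical, and a hosted CI trigger would likely replace the change detection.

```python
import hashlib
import zipfile
from pathlib import Path


def fingerprint(directory):
    """Return a SHA-256 digest over the relative paths and bytes of all files."""
    directory = Path(directory)
    digest = hashlib.sha256()
    for path in sorted(p for p in directory.rglob("*") if p.is_file()):
        digest.update(path.relative_to(directory).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def rebuild_if_changed(raw_dir, stamp_path, rebuild):
    """Call rebuild() only when raw_dir differs from the recorded fingerprint.

    stamp_path must live outside raw_dir, otherwise writing it would
    change the fingerprint on every run.
    """
    stamp_path = Path(stamp_path)
    current = fingerprint(raw_dir)
    previous = None
    if stamp_path.exists():
        previous = stamp_path.read_text(encoding="utf-8").strip()
    if current == previous:
        return False
    rebuild()
    # Record the fingerprint only after the rebuild has finished.
    stamp_path.write_text(current + "\n", encoding="utf-8")
    return True


def zip_csvs(csv_dir, zip_path):
    """Bundle every CSV in csv_dir into one archive; return the member names."""
    csv_dir = Path(csv_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_file in sorted(csv_dir.glob("*.csv")):
            archive.write(csv_file, arcname=csv_file.name)
            names.append(csv_file.name)
    return names
```

Hashing file contents rather than comparing modification times means a fresh checkout with unchanged data would not trigger a rebuild, at the cost of reading every raw file on each run.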

It feels like we're slow-walking towards a rudimentary pipeline with pub/sub actions and maybe an ephemeral VM for building out the database and zip. GCP rocks for this sort of stuff, but I need to upskill in order to set it up. I'd like to expand my skills in this direction anyway, so it might be the perfect time to learn? @jikaczmarski sounds interested too!

eldobbins commented 3 months ago

Potential new directory layout

ianalexmac commented 2 months ago

There was a lot of discussion about this topic yesterday. Highlights include:

ianalexmac commented 2 months ago

The data page now has table previews and CSV downloads for the 4 tables that we're currently using to generate the visuals.

ianalexmac commented 2 months ago

@jikaczmarski @eldobbins We're at a stopping point on the data page. We could either close this issue or regroup and decide on changes/features (minus #39, adding a metadata parser and corresponding links).

eldobbins commented 2 months ago

Two more items to do:

jikaczmarski commented 2 months ago

Added consumption data to the data portal.

ianalexmac commented 2 months ago

Data page is in fine shape for now. Closing this issue.