Closed gschivley closed 5 years ago
Would you be willing to just go ahead and update the docs and make a PR? I'm on Linux, and we start over from scratch rarely... so this kind of blank slate revision is very useful.
Sure. What is the expected way to define the data to be downloaded vs initialized in the database? Is it arguments and settings.py? Having all of CEMS downloaded by default seems like it might catch users by surprise. I modified settings.py before running init_pudl.py and assumed that it would be used for data download.
@gschivley, my understanding is that postgres.app defaults to trusting any connection, so there's no need for a .pgpass file
Right now the datastore and the DB initialization are managed completely separately, though I could see update_datastore.py reading the same settings.py file to figure out what it ought to download. But at the moment it's the command line arguments for update_datastore.py which determine what gets downloaded, and the years listed for each data source in settings.py which determine what gets pulled into the DB.
Ok, interesting. I couldn't add/drop tables as the catalyst user (with the reset_db.sh file). Had to start psql and do it as the default user (Greg). Is that related?
@gschivley Maybe it depends which user created / owned the database? When I was on a Mac I also recall never having to deal with any permissions stuff.
No idea. Maybe I'll add a quick note about it in the getting started.
@gschivley Did you feel like the updated getting started document was satisfactory?
Much better. A note about updating multiple data sources with different calls to update_datastore.py might be nice. Sounds simple but seeing the CI code pulling in one source at a time was helpful.
python update_datastore.py -s eia923 -y 2017
python update_datastore.py -s eia860 -y 2017
python update_datastore.py -s epaipm
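For anyone who wants to script the one-source-at-a-time pattern above, here is a minimal sketch. It only builds and prints the command lines, so nothing is downloaded. The helper name build_datastore_commands is hypothetical, and it assumes that -s and -y behave as in the calls quoted in this thread and that -y accepts several years the way --years does.

```python
def build_datastore_commands(sources):
    """Return one update_datastore.py command line per data source.

    `sources` maps a source name to a list of years. An empty list means
    no -y argument is passed, as in the epaipm call quoted in this thread.
    This helper is hypothetical and is not part of PUDL.
    """
    commands = []
    for source, years in sources.items():
        parts = ["python", "update_datastore.py", "-s", source]
        if years:
            # Assumption: -y takes one or more years, like --years.
            parts.append("-y")
            parts.extend(str(year) for year in years)
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    requested = {"eia923": [2017], "eia860": [2017], "epaipm": []}
    for command in build_datastore_commands(requested):
        # Dry run: print each command instead of executing it.
        print(command)
```

Keeping one call per source mirrors what the CI code does, and makes it easy to see which source failed if a download breaks.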
I installed PUDL and built a postgres database from scratch on a Mac. Along the way I encountered a few issues (see below). The mac/linux/windows docs aren't always consistent, which should also be fixed.
PostgreSQL
They should probably also say to click initialize
sudo mkdir -p /etc/paths.d && echo /Applications/Postgres.app/Contents/Versions/latest/bin | sudo tee /etc/paths.d/postgresapp
) and expected output of which psql would be helpful.
Other issues
pip install -e .
python update_datastore.py --sources ferc1 eia923 eia860 --years 2014 2015 2016 2017
) - add this to the Mac and Windows instructions
python init_pudl.py -f local_settings.yml
, which I think is the same file but with a different name? Either way, it would be nice to have a description of the settings for downloading data with the datastore vs loading things into postgres.