Closed thcrock closed 3 years ago
Thanks for this @thcrock
database.yaml
thing (is another file, prone to errors, security risk, etc) but if we want to keep that interface we should support the environment variables (DATABASE_URL
or the PG_...
ones)@thcrock weird the crosstabs thingy, last week I ran it from the repo without problems. I will check it to track the bug
It does appear that I'm the only one who's had the crosstabs problem so I will try building my environment again.
As for environment variables: we do support environment variables for database credentials in Triage, but this database provisioner is just creating the credentials, not using them. Any way of attempting to automatically set the environment variables for the user would be shell-specific and OS-specific, and not work until they restart the shell. Do you have something in mind? Just print the credentials to screen and tell the user to set those environment variables?
@nanounanue about the first question: This came up at the meeting. Ultimately it seems the team wants to use it for Dirty duck (as in, remove the docker-compose and bastion dependencies from Dirty Duck). That would require much more work than is present here. However, we did decide that we can merge this PR, as currently scoped, to just be an option for regular Triage users for now. So you might do this once you finish the tutorial and want to start running Triage on your own data.
Yeah, just installed triage from master in a new environment and didn't run into issues with crosstabs in the CLI either.
In terms of testing out here, I did a pip install git+https://github.com/dssg/triage.git@triage_db
in a new environment and then tried running triage db up
but hit this error:
(temp-3.7.2) code $ triage db up
[ProcessExecutionError] Command line: ['/usr/local/bin/docker', 'build', './dirtyduck/food_db/', '-t', 'triage_db']
Exit code: 1
Stderr: | unable to prepare context: path "./dirtyduck/food_db/" not found
I think we might need to change it to look for food_db
relative to the code path?
@shaycrk So far I've been unable to get the contents of the dirtyduck folder to show up in the installed package, which is necessary for that docker image to be buildable.
But maybe I don't need to. For this purpose, are PostGIS and the dirtyduck data necessary? I could more easily just have the triage db command use a vanilla Postgres image, at least for now.
Manifests are a big problem, I solved this issue in the pastñet me check my notes ...
On Sat, Apr 10, 2021, at 16:07, Tristan Crockett wrote:
@shaycrk https://github.com/shaycrk So far I've been unable to get the contents of the dirtyduck folder to show up in the installed package, which is necessary for that docker image to be buildable.
But maybe I don't need to. For this purpose, are PostGIS and the dirtyduck data necessary? I could more easily just have the triage db command use a vanilla Postgres image, at least for now.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dssg/triage/pull/840#issuecomment-817202257, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADYXQFGQVSASH2JLV3G6P3TIC4XXANCNFSM42DCBEOA.
@nanounanue I spent some more time on the Manifest. Apparently the 'dirtyduck' directory is successfully getting into the 'sdist' (what the Manifest.in is supposed to do) but is not getting copied to the site-packages/triage directory. This is despite 'include_package_data' being set to True in setup.py
@nanounanue Here is the deal with the manifest. The 'dirtyduck' directory wasn't getting into the build directory despite the MANIFEST.in
and include_package_data=True
combo because it's not within a package directory. If I move it to src/triage, it works.
I tried this based on this reply on a related issue: https://github.com/pypa/sampleproject/issues/30#issuecomment-426993883
Moving it into that directory may change some other things. Before deciding to do this, I think we should confirm that we want the database provisioner to actually include the dirtyduck data scripts. If the user types triage db up
, should they get a vanilla working postgres database or should they get the food database sample data (including the one or two minutes it might take for the system to download the inspections data and load it)? @shaycrk
Thanks @thcrock! I've definitely been thinking about this mostly as motivated by making the dirty duck tutorial a little more seamless (as well as making it possible to run after a pip install triage
without having to clone the repo), though I can certainly see some utility in providing an empty database, too. My sense is it might be nice to support both unless that seems unreasonable (e.g., maybe triage tutorial up
vs triage db up
?). Anyway, we also have our planning meeting tomorrow, so might be good to talk it through then if you can both make it?
Hey @shaycrk this is ready for review again, with just the vanilla db.
One thing you'll note is that the docs/sources/index.md changed a bunch. This is because I ran 'manage docs', which copies README.md to docs/sources/index.md. This is something I put in a while ago to solve the problem of either 1. having content split between the documentation site and the README or 2. having the README be a stub and punt the user to the documentation site.
We can iterate on this process depending on what the team wants, but for now it looks README and docs/sources/index.md were out of sync. I'm not sure which version of this page correct. README was modified much more recently than index.md, so I'm guessing README is the right version to use for now, and if that's true this PR syncs them up again. But check it out in case anything is wrong.
Thanks @thcrock! Added a few inline comments, mostly around things like docs and naming, but looks good to me overall
Thanks @shaycrk I just uploaded some changes based on your review
Thanks @thcrock -- looks good to me! Feel free to merge if you're happy with it
Not quite ready for merge (I would add some documentation around this and maybe clean up the code a bit), but I wanted to get thoughts on this interface for a local Dockerized database provisioner. The idea is Triage users could use this to easily set up a database outside of a Dirtyduck context but without having to do Postgres on their own. With just docker as a dependency, they can:
All using the command 'triage db up'.
Note: I don't know what's going on with the Crosstabs CLI but I couldn't get this to run while using those imports. Did that crosstab code get removed? It's obviously unrelated to this PR but if I can't run the CLI with those lines in there I had to take them out.