aquametalabs / aquameta

Web development platform built entirely in PostgreSQL
GNU General Public License v3.0

Enable installation through de facto package managers #208

Open micburks opened 4 years ago

micburks commented 4 years ago

I was attempting to make an install script for MacOS. Making this issue to start a discussion about the experience.


For debian/ubuntu systems, the install script works really well. It falls short in 2 ways right now:

Ideal world for me, aquameta install would look something like

sudo apt-get install postgresql-11 aquameta
aquameta init

# or on mac
brew install aquametalabs/aquameta
aquameta init

I'm not sure I've ever gotten a full install on a mac, mostly due to python issues, so I usually remove the fs layer and use something other than uwsgi.

Separate conversations:

I'm not trying to hate on python right now, I'm just a little frustrated with trying to shoehorn this install into MacOS.


What could be done

Developing a release pipeline and using de facto package managers would make aquameta more attractive to first-time users and hopefully give this project a better focus on delivering on a regular cadence.

For brew, it seems like you only need to deploy a tar and create a little ruby script to run the install commands.

For deb, you may be able to create and host an apt repository from GitHub or S3 or something. Not super sure on this.

First steps

erichanson commented 3 years ago

I've been thinking about pruning as many dependencies as possible, with this great goal in mind.

Aquameta does more than PostgreSQL does out of the box, and some of these dependencies are functionally essential, but could probably be replaced by something simpler to distribute. I've been looking at Rust and am pretty convinced it is the future of Aquameta's forays.

Other dependencies are nice-to-haves but could be dropped from the core distribution.

Here's the short list of dependencies present right now:

See also the required apt packages.

Some thoughts:

Apps that have complex server-side business logic are still going to need something besides PL/pgsql. plpythonu might still be the best choice of languages for that, but also it might not be. Regardless, if we got it out of core, people could get up and running without having to deal with all that and then choose and configure additional languages as desired.

themightychris commented 3 years ago

I'd highly recommend looking at Chef Habitat as your main way to package and distribute something like this. It would keep your dependency list from being such a major factor for users.

erichanson commented 3 years ago

I spent some time looking into Chef Habitat. I sort of get what they are going for, but regardless I think I'm aiming for a much smaller footprint, not adding more complexity to the stack. There's really almost nothing to orchestrate if we can get things down to just a simple binary http server that connects to PostgreSQL.

I found out that allegedly you can embed a uWSGI application into uWSGI itself and build a distributable binary. It would be pretty slick to just build the endpoint server into uWSGI and off we go.

Spent some time looking into Rust and got increasingly intimidated. A talk by Jeff Davis shows a lot of light at the end of the tunnel, and long-term I still think Rust might be the way to go, but it requires a lot of knowledge of C and PostgreSQL internals that is a bit over my head.

Also looked into Go, which looks very doable, and someone has done something similar with pREST, a REST server for PostgreSQL written in Go, similar to PostgREST.

erichanson commented 3 years ago

@themightychris Can you say more about how we could use Chef Habitat to deploy Aquameta for multiple platforms? I have never packaged a piece of software for multiple platforms before and know almost nothing about it. I read through their website and marketing materials, but still don't quite get how it would fit in or what problems it is solving.

erichanson commented 3 years ago

I went down the rabbit hole of trying to compile uwsgi with the Aquameta endpoint app embedded in it, following these instructions. Got it to build, but couldn't figure out how to get it to embed all the module dependencies (psycopg2, werkzeug, etc.), and it still required the libpython2.7.so.1.0 library. Seems like it's a pretty alpha approach, off the beaten path. See also A bottle app+uWSGI embedded in the same binary, issues... Giving up and learning Go, issue #213.

micburks commented 3 years ago

I think a "single binary" distribution of the Aquameta server would be great, but I want to mention that the goal could be generalized to "reliable, cross-platform distribution". One technology I want to mention, specifically because I think it's super interesting and I want to work more with it, is the Nix package manager. You get hermetic dependencies and platform-specific builds. It even uses a graph database to store/link packages. Dependencies as data... peculiar little idea ;)

As for Chef Habitat, it sounds to me like it's just a DevOps pipeline for Docker images. Which I'm sure is great, but Docker has always been the Aquameta example of what is wrong with bloated software distribution.

erichanson commented 3 years ago

> I think a "single binary" distribution of the Aquameta server would be great, but I want to mention that the goal could be generalized to "reliable, cross-platform distribution".

Yeah I think that's the goal (what you said), I'm just hyper-focused on pruning as many dependencies from the tree as possible right now since whatever is still left over will kinda drive the requirements. Even if we got it down to zero deps, it still needs to be compiled for all the architectures and then packaged into all the package systems, ideally in an automated, release-based system. You know about the state of the art for that pipeline?

> One technology I want to mention, specifically because I think it's super interesting and I want to work more with it, is the Nix package manager. You get hermetic dependencies and platform-specific builds. It even uses a graph database to store/link packages. Dependencies as data... peculiar little idea ;)

Heh. nix_fdw? I'll check it out. It would be really cool to have some sort of database-centric (aka using SQL) way to affect the system outside the database, the OS and packages and all that.

doublemarked commented 3 years ago

I feel like I’ve been having or witnessing this conversation in some form or another for the past 20 years 🙄

erichanson commented 3 years ago

> I feel like I’ve been having or witnessing this conversation in some form or another for the past 20 years 🙄

There are few topics that have me self-censoring curse words and trying to regulate my abject loathing more than this one. It's the worst, and the incorrect decisions were made decades ago, and nothing additive is going to correct course. I'm revolting! Er.

erichanson commented 3 years ago

Go. Single self-contained binary. Go!

erichanson commented 3 years ago

Thought I'd post an update on this epic goal. We're getting close:

Aquameta now has zero external dependencies as far as I can tell. The HTTP server has been reimplemented in Go. I've pulled out multicorn and plv8 and anything that depends on plpython and either just eliminated it from core, or reimplemented it with plgo. Server-side templates aren't completed yet but it's in the works and I'm just working around it for now with static HTML resources.
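To make the "single Go binary serving HTTP in front of PostgreSQL" idea concrete, here is a minimal sketch using only the Go standard library. It is not Aquameta's actual server: the `/endpoint/` path, the `queryFunc` type, the `fakeQuery` rows and the `:8080` port are all assumptions made for illustration, and the database call is stubbed out so the example runs without a PostgreSQL instance.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// queryFunc is a hypothetical stand-in for the database call. A real server
// would run SQL here via database/sql and a PostgreSQL driver.
type queryFunc func(relation string) ([]map[string]interface{}, error)

// newEndpoint returns a handler that serves GET /endpoint/<relation> as JSON.
func newEndpoint(query queryFunc) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/endpoint/", func(w http.ResponseWriter, r *http.Request) {
		relation := strings.TrimPrefix(r.URL.Path, "/endpoint/")
		rows, err := query(relation)
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(rows)
	})
	return mux
}

// fakeQuery returns canned rows so the sketch runs without a database.
func fakeQuery(relation string) ([]map[string]interface{}, error) {
	if relation != "widget" {
		return nil, fmt.Errorf("unknown relation %q", relation)
	}
	return []map[string]interface{}{{"id": 1, "name": "example"}}, nil
}

// get performs an in-memory GET against h and returns the status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := newEndpoint(fakeQuery)
	status, body := get(h, "/endpoint/widget")
	fmt.Println(status, body)
	// A real binary would instead listen on a port (assumed here):
	// http.ListenAndServe(":8080", h)
}
```

Because everything above comes from the standard library, `go build` produces one executable with no runtime dependencies, which is the property this thread is after. Swapping `fakeQuery` for a function backed by a real PostgreSQL driver would add a Go module dependency but not a system package.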

PLGO is really nice in theory but fairly immature at this point. You can't create PLGO functions with arguments of any types except the ones it supports, which is only a subset of what is in Postgres core and doesn't include json or uuid. There are no technical hurdles here afaict other than just doing it. Making background workers in Go seems quite feasible as well; the pREST guys implemented one.

The Go server also includes an embedded version of PostgreSQL thanks to the embedded-postgres project, which leans on zonkyio postgres binaries for any recent version of Postgres, for any common architecture. These binaries just include a very minimal Postgres server -- there's no pg_config or psql client, among lots of other things, just the postgres server binary, pg_ctl and initdb. I think the intent of the project was to just spin up a server for use in testing and then destroy it at the end of the test -- not to use it the way we are, as an actual persistent database. I had to make some modifications to the project to get it to not destroy the database when it shuts down, so I'm pulling from a fork of the project here.

We might want to start shipping a more complete binary down the line, or just encourage people to run their own server, but the embedded postgres is great for getting something quick and simple up and running.

I've been experimenting with xgo and had good success compiling the new Go server for all the architectures it supports (basically everything I've ever heard of, and a lot I hadn't). So, once we have a binary for all the platforms, getting from there to a .dmg or .deb or .rpm or .exe isn't far at all. I have heard tell of toolchains that help out with this but know nothing about them yet. Suggestions welcome.
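For reference, the per-platform build step can be sketched with the stock Go toolchain, which selects the target through the `GOOS` and `GOARCH` environment variables. This is a hedged illustration rather than the project's build script: the binary name `aquameta`, the target list and the `CGO_ENABLED=0` setting are assumptions, and it only applies if the server is pure Go (xgo exists for the case where cgo is involved). The program prints the commands as a dry run instead of executing them.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// target is one operating system / architecture pair to build for.
type target struct{ goos, goarch string }

// outputName builds a per-platform file name, adding .exe on Windows.
func outputName(name string, t target) string {
	out := fmt.Sprintf("%s-%s-%s", name, t.goos, t.goarch)
	if t.goos == "windows" {
		out += ".exe"
	}
	return out
}

// buildCommand prepares (but does not run) a cross-compiling `go build`.
func buildCommand(name string, t target) *exec.Cmd {
	cmd := exec.Command("go", "build", "-o", outputName(name, t), ".")
	// GOOS and GOARCH pick the target; CGO_ENABLED=0 assumes no cgo.
	cmd.Env = append(os.Environ(),
		"GOOS="+t.goos, "GOARCH="+t.goarch, "CGO_ENABLED=0")
	return cmd
}

func main() {
	// Illustrative target list, not the set the project actually ships.
	targets := []target{
		{"linux", "amd64"},
		{"darwin", "arm64"},
		{"windows", "amd64"},
	}
	for _, t := range targets {
		cmd := buildCommand("aquameta", t)
		// Dry run: print the command; call cmd.Run() to actually build.
		fmt.Printf("GOOS=%s GOARCH=%s %s\n",
			t.goos, t.goarch, strings.Join(cmd.Args, " "))
	}
}
```

Each resulting file would then be the input to whatever produces the .deb, .rpm, .dmg or Homebrew artifact; that packaging step is not covered by this sketch.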