Updated deploy

cooldracula commented 1 year ago

This PR introduces a number of new and updated Ansible roles to aid in deploying a full room instance to a new host.

They are designed for a host with a single domain running three services:

rooms-frontend
go-ssb-room
planetary-graphql

You could then access these services from one URL, e.g. planetary.name, planetary.name/graphql, and planetary.name/login.

An example inventory and playbook are included in the PR.

I've tested this on two instances with success (see the caveats below). It is mostly automated.

Notes

The Docker Images

The Docker images used in the inventory are hosted on my personal repository, but it would be straightforward to build and push the images to a planetarysocial Docker repository. I opened PRs in each of the repos to dockerize the services.

Deploying a new instance

Let's say we want to create the planetary room tigers.cool. The steps for deploying a new instance would be:

1. Create a DigitalOcean droplet.
2. Assign two A records, for tigers.cool and *.tigers.cool, pointing to the droplet.
3. Set up an inventory.yaml based on the example, with tigers.cool as the inventory_hostname. The majority of the vars can be left alone, though you will likely want to set different passwords.
4. Run ansible-playbook -i yr-inventory.yml single-host-playbook.yml

After this is done, you would then manually add the TLS wildcard certificates using certbot by SSHing into the host and running a couple of commands.
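A rough sketch of what that manual step might look like, assuming certbot's manual DNS challenge and assuming nginx runs as a systemd service on the host (neither is confirmed by the roles in this PR). The `run` helper and the `issue_wildcard_cert` function are hypothetical names used only for illustration; setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Request a certificate covering a domain and its wildcard, then restart nginx.
# Assumption: run as root on the host. With --manual and a DNS challenge,
# certbot prints TXT records that must be added to the DNS zone before it
# can finish validating.
issue_wildcard_cert() {
  domain="$1"
  run certbot certonly --manual --preferred-challenges dns \
    -d "$domain" -d "*.$domain"
  # Assumption: nginx is managed by systemd on this host.
  run systemctl restart nginx
}
```

Running `DRY_RUN=1; issue_wildcard_cert tigers.cool` first is a cheap way to review the exact commands before executing them on the host.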

Updating individual services

Since the services are containerized, you would update a service in its own repo, build a new Docker image, then tag and push it to your Docker repository. You could then run just the role you need, passing in the new Docker tag. The role handles restarting the service. The nginx configuration is kept separate from the maintenance of individual services to make updating easier.
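As a sketch of that update flow, under some loud assumptions: the image name, the `docker_tag` variable, and selecting a role with `--tags` are all hypothetical here, and the real variable and tag names would come from the example inventory and playbook. The `run` and `update_service` names are also illustrative only.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build and push a new image, then re-run a single role with the new tag.
#   $1 image name (hypothetical, e.g. example/planetary-graphql)
#   $2 new Docker tag
#   $3 Ansible tag selecting the role (assumes the playbook tags its roles)
update_service() {
  image="$1"
  tag="$2"
  role="$3"
  run docker build -t "$image:$tag" .
  run docker push "$image:$tag"
  # docker_tag is a placeholder variable name, not one taken from the PR.
  run ansible-playbook -i yr-inventory.yml single-host-playbook.yml \
    --tags "$role" --extra-vars "docker_tag=$tag"
}
```

Called from the service's repo, this would look something like `update_service example/planetary-graphql v2 planetary-graphql`.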

CAVEATS

There are a few caveats with the PR that make it likely not ready to merge. I wanted to open it now though to share progress and start convos and such. The caveats are:

1. It still requires a manual step for new instances

While the configuration of nginx is done through Ansible, it is easier at the moment to do the TLS setup by interacting with certbot directly on the host. Essentially, you run a command to create the certificates, certbot gives you some records to add to your DNS, and then you run certbot again and restart nginx. After that, the configuration is basically done.

2. There's performance issues with the graphql service

I am having a devil of a time getting the graphql service to stay running on my own instances. It eats up memory and seems to have a memory leak. I tried to put some memory controls on the Docker container, but they had little effect. I did a small amount of digging, and there seems to be a common issue with memory leaks in the Apollo GraphQL client our service is using (ex1, ex2). At the moment, I can get everything to work, but it only stays up for a few minutes before the graphql container dies. I would love to pair on this to see where the issue might be. This setup doesn't feel production ready until this part is solved :(.

3. Blobs don't work

The blob service associated with the graphql instance works, but I wasn't quite sure which routes we intended the blobs to have. The frontend seems to request specific IDs, but other examples in the script mention a /blob/ path, a /get/ path, and a /blob/get path. I can update this PR with the correct routing once I understand it better.

4. alias redirection doesn't work

This is similar to blobs, in that I wasn't sure of the exact routes we wanted. The wildcard certs work in general, and I can update the proxy config easily once I have a better understanding.

I would be happy to give a demo to show the flow in more detail and to discuss all of this!

CLAassistant commented 1 year ago

All committers have signed the CLA.

chereseeriepa commented 1 year ago

👏 👏 Wow @cooldracula this is awesome! Thanks for the detailed notes, they made this really easy to understand! 💟 I haven't taken a look at the PR itself yet, but I thought I'd start off by responding to some of the caveats you mentioned above:

1. It still requires a manual step for new instances

I think that's fine, as long as there are clear docs so that someone like me, who is unfamiliar with this step, can "copy and paste" a few commands to achieve it (if possible).

2. There's performance issues with the graphql service

I'm hoping you resolved this issue with @mixmix 😄

3. Blobs don't work

The reason you are seeing different paths is because I changed the path for the graphql server running on graphql.planetary.pub to allow the blob server to be available at https://graphql.planetary.pub/blob/{blobId}, because that server is not on the same host as the room server it is paired with.

I think blobs should also be served from the same URL as the other services, e.g. planetary.name/blob/{blobId}, which routes to http://localhost:26835/get/{blobId}. This will also help with the CSP issues you get from Safari (I think).
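To make the intended mapping concrete, here is a small sketch of the path rewrite, written as a shell function rather than as the actual nginx config. The function name `blob_upstream` is hypothetical, and the blob ID is treated as an opaque string (any URL encoding of real blob IDs is left out).

```shell
# Map a public request path to the local blob server URL.
# /blob/{blobId} -> http://localhost:26835/get/{blobId}
# Returns non-zero for any path that is not a blob path.
blob_upstream() {
  case "$1" in
    /blob/?*)
      # Strip the leading /blob/ and keep the rest as the blob ID.
      printf 'http://localhost:26835/get/%s\n' "${1#/blob/}"
      ;;
    *)
      return 1
      ;;
  esac
}

blob_upstream /blob/abc123   # prints http://localhost:26835/get/abc123
```

The proxy config would express the same rule as a location block for /blob/ that passes requests on to port 26835 under /get/.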

4. alias redirection doesn't work

Oh yeah, so the address the aliases need to be redirected to is:

cherese.planetary.name => planetary.name/profile/alias/cherese

I made the /profile/alias/{alias} route on rooms-frontend, which handles loading the page for an alias.
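The redirect rule can be sketched like this, again as a shell function standing in for the proxy config. The function name `alias_redirect` is hypothetical, and the https scheme plus the choice to accept only single-level subdomains are assumptions on my part.

```shell
# Turn an alias subdomain into its profile URL on the root domain.
#   $1 requested host, e.g. cherese.planetary.name
#   $2 root domain, e.g. planetary.name
# Returns non-zero if the host is not a single-level subdomain of the root.
alias_redirect() {
  host="$1"
  root="$2"
  case "$host" in
    *."$root")
      # Everything before ".<root>" is the alias.
      name="${host%."$root"}"
      ;;
    *)
      return 1
      ;;
  esac
  case "$name" in
    ''|*.*) return 1 ;;   # empty alias or nested subdomain
  esac
  printf 'https://%s/profile/alias/%s\n' "$root" "$name"
}

alias_redirect cherese.planetary.name planetary.name
# prints https://planetary.name/profile/alias/cherese
```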

Hope these responses help. If you need more info, I'm happy to help.

planetary-social / ansible-scripts

Updated deploy #12