= trends.tf
Source code for https://trends.tf/[trends.tf]
== Getting started
Create a virtualenv and activate it
$ virtualenv venv
$ source venv/bin/activate
Install trends.tf inside that virtualenv
$ ./setup.py develop
Once you have made your changes, you can run some basic tests
$ pytest
This takes around 10 seconds on my machine. To run a development server, you will first need to
set up a database.
=== Setting up a database
First, install postgresql and https://www.postgresql.org/docs/current/app-initdb.html[initialize the
cluster], if that wasn't done automatically. You may also need to start the database by running
$ sudo systemctl start postgresql
Next, create a postgres user for yourself
$ sudo -u postgres createuser --interactive $USER
And then create a database owned by your user
$ sudo -u postgres createdb -O $USER trends
Verify that you can connect to the database by running
$ psql -d trends
You can exit the psql
shell using the \q
command. Finally, before importing any data, run
$ sudo -u postgres psql -d trends -f trends/bloom.sql
to enable the bloom index extension (which must be done as a superuser).
=== Importing data
There are several different ways to import new logs. The simplest is to use the test data set. To
populate your database, run
$ test/create.py postgresql:///trends
Any valid https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING[connection
string] may be used instead of postgresql:///trends
. This data set is small, and mostly contains
unusual logs which caused bugs in the past.
The performance of trends.tf changes significantly as more data is imported into the database. To
quickly import many logs for testing, you can use the output of the
https://github.com/ldesgoui/clone_logs[`clone_logstool]. Cloned logs in batches of 100,000 are available from https://mega.nz/#F!l9oGiKCb!lTWT2RSkTYv-TJZb92_ksA[Lucas Desgouilles's archive]. For example, to import logs from the file
2200k-to-2300k.7z`, run
$ 7z x 2200k-to-2300k.7z
$ trends_importer -vv logs clone_logs -d 22.sqlite3 postgresql:///trends
=== Running the server
To launch a development server, run
$ FLASK_DEBUG=1 development python -m trends.site
If you are using a different database, you will need to set the DATABASE
environmental variable to
its connection string. If you are using an older version of flask, you will need to set
FLASK_ENV=development
instead of FLASK_DEBUG=1
.
== Deploying trends.tf
These are the instructions to deploy a new server hosting trends.tf. The only requirements are that
it be running Arch Linux or Debian (Ubuntu will also likely work), and that there is ssh access
available (for the root user).
First, edit salt/roster
to contain a new entry like
[source,yaml]
:
host:
user: root
----
Where `` is the name you will use to refer to the server in the upcoming commands, and
`` is the IP address of the server. Then, run
$ salt-ssh -i test.ping
The `-i` option will accept the server's host SSH key. You should see a response like
:
True
The next few steps will create a user with ssh access and sudoer privileges. Create a link to the
ssh key you wish to install:
$ ln -s ~/.ssh/id_rsa.pub salt/
You will need to do the same thing for ca.crt if you want to access netdata. Then, run
$ salt-ssh state.apply bootstrap
The output of this command should show all changes made to the server. Next, switch to using the
`sean` user to connect to the server. Modify the entry you previously made in `salt/roster` to be
[source,yaml]
----
:
host:
user: sean
sudo: True
----
Then, run
$ salt-ssh state.apply
This will take a while (around 2 minutes). Once it is done, all states should have succeeded except
for `nginx.service` and `player_import.service`. This is because the certificates are missing and
the steamkey has not been set. To fix the first problem, copy over the certificates from an existing
site. That is, run
[server1]$ sudo tar -cvf /tmp/certs.tar.gz /etc/letsencrypt/
$ scp :/tmp/certs.tar.gz :/tmp/
[server2]$ sudo tar -xvf /tmp/certs.tar.gz -C /
I would like to fix this at some point. Then, create a file `/etc/default/trends` containing
----
STEAMKEY=
----
with your https://steamcommunity.com/dev/apikey[Steam API key]. If you are using sentry, you can
also set `SENTRY_DSN` here. Set the permissions of this file to `600`. Once this is complete,
re-apply the top-level state.
$ salt-ssh state.apply
To support backups, first create a new application key with read/write access to the backup bucket.
Create a file `/etc/default/restic` containing `B2_ACCOUNT_ID`, `B2_ACCOUNT_KEY`,
`RESTIC_REPOSITORY`, and `RESTIC_PASSWORD`. Set its permissions to `600`.
To support uploading to grafana, create a new application key with the metrics push role. Then,
create a file `/etc/prometheus_pass` containing the key. Set its permissions to `640`, and set the
`user:group` to `root:grafana-agent`.
== Maintenance tasks
=== Restoring a backup
To restore a database (making sure that the variables outlined above are exported), run
# mkdir /srv/postgres/restored
# chown sean:sean /srv/postgres/restored
$ restic dump latest trends.tar.zst | unzstd - | tar -xC /srv/postgres/restored
$ pg_verifybackup /srv/postgres/restored
# chown -R postgres:postgres /srv/postgres/restored
# systemctl stop postgresql@XX-data
# rm -rf /srv/postgres/data
# mv /srv/postgres/restored /srv/postgres/data
# systemctl start postgresql@XX-data
You can also specify a different snapshot instead of `latest` by using an ID from `restic
snapshots`.
=== Compressing old logs' json
Old logs' json can be compressed using citus's columnar storage. Once a partition is no longer being
updated, it can be converted to columnar access. Rows will no longer be able to be updated or
deleted, but this reduces the size of the table by around 1/2 (and the process is reversable).
If you haven't done so already, enable citus (as postgres)
=# CREATE EXTENSION citus;
To convert a partition, run
=> -- https://github.com/citusdata/citus/issues/5918
=> SET columnar.chunk_group_row_limit = 5000;
=> SELECT alter_table_set_access_method('log_json_XXe5', 'columnar');
replacing `XX` as appropriate. To convert back, use `heap` instead of `columnar`.
=== Upgrading PostgreSQL
In the following notes, XX refers to the old postgres version, and YY refers to the new version. ZZ
refers to the version of `citus`. First, install the new version of postgres:
# apt-get install postgresql-YY postgresql-YY-citus-ZZ
Make sure this version of citus is also supported by the old postgres (`postgresql-XX-citus-ZZ`).
Upgrade if you have to (`ALTER EXTENSION citus UPDATE`).
Stop the old cluster.
# systemctl stop postgresql@XX-data
Next, move the old cluster to a new location and create a new cluster.
# mv /srv/postgres /srv/postgresXX
# mv /var/lib/postgres /var/lib/postgresXX
# ln -sf /var/lib/postgresXX/data/pg_wal /srv/postgresXX/data/pg_wal
# mkdir -p /srv/postgres/data /var/lib/postgres/data/pg_wal
# chown -R postgres:postgres /srv/postgres /var/lib/postgres/data
$ pg_createcluster YY data -d /srv/postgres/data
Edit `/etc/postgresql/XX/data/postgresql.conf` and update `data_directory`. Then, restart the old
cluster
# systemctl start postgresql@XX-data
Now, check to see if the upgrade is compatible:
$ /usr/lib/postgresql/YY/bin/pg_upgrade --check -d /etc/postgresql/XX/data/ \
-b /usr/lib/postgresql/XX/bin -D /srv/postgres/data -B /usr/lib/postgresql/YY/bin
If it is, stop the old cluster
# systemctl stop postgresql@XX-data
and run the above command without `--check`. After the upgrade, edit the configs and swap the
ports. The new cluster will be using port 5433; change it to 5432.
# vim /etc/postgresql/{XX,YY}/data/postgresql.conf
# cp /etc/postgresql/{XX,YY}/data/conf.d/override.conf
Now start the database, update extensions, and vacuum:
# systemctl stop postgresql@YY-data
$ psql -p 5433 -d trends -f update_extensions.sql
$ vacuumdb -p 5433 --all --analyze --verbose