UFOKN / sophox

A collection of services exposing OSM data, metadata, and other microservices
Apache License 2.0

# Sophox

## Installation

A full-planet Sophox instance should be installed on a sufficiently large server (40+ GB RAM, 1 TB disk), preferably with an NVMe SSD. On Google Cloud, a local SSD scratch disk is also recommended. Use environment variables to override which data gets loaded. See also the Development section below.

The server must have bash, docker, curl, and git. Everything else is loaded inside docker containers.

When cloning, make sure you get the submodules (e.g. `git submodule update --init --recursive`).

### Google Cloud

### Hetzner or similar server

We have a machine with 12, 128GB RAM, and 3 SSDs: 2 small ones and a 1.8TB one.

You may need to use "bionic" instead of `$(lsb_release -cs)`:

    add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

    apt-cache policy docker-ce
    apt update && apt-get install -y docker-ce

Format and mount the large disk, and make it auto-mount. We use xfs, but ext4 is fine too.

    mkdir -p /mnt/data && mount -o discard,defaults /dev/sdc /mnt/data
    echo "UUID=$(blkid -s UUID -o value /dev/sdc) /mnt/data xfs discard,defaults,nofail 0 2" | tee -a /etc/fstab


* Install Sophox:

      export DATA_DIR=/mnt/data
      export REPO_BRANCH=main
      nohup curl --fail --silent --show-error --location --compressed \
        https://raw.githubusercontent.com/Sophox/sophox/${REPO_BRANCH}/docker/startup.planet.sh \
        | bash >> /mnt/data/startup.log 2>&1 &


### Monitoring
* See docker statistics:  `docker stats`
* View docker containers:  `docker ps`
* See an individual container's log:  `docker logs <container-id>` _(the ID can be just its first few characters)_
* `localhost:8080` shows Traefik's configuration and statistics.

## Automated Installation Steps
These steps are done automatically by the startup scripts. Many of the steps create empty `status` files in the `data/status` directory, indicating that a specific step is complete. This prevents full rebuild when the server is restarted.
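
The status-file idea can be sketched as a small guard function. This is only an illustration of the pattern, not the code used by the startup scripts: `run_once`, the `STATUS_DIR` default, and the step names are hypothetical.

```bash
#!/usr/bin/env bash

# Assumed location of the marker files; the real scripts may use another path.
STATUS_DIR="${STATUS_DIR:-./data/status}"

# Hypothetical helper: run a step only if its marker file is absent.
#   $1     marker name (e.g. "file.downloaded")
#   $2...  the command that performs the step
run_once() {
  local name="$1"
  shift
  local marker="${STATUS_DIR}/${name}"

  if [[ -e "${marker}" ]]; then
    echo "skip ${name}"
    return 0
  fi

  # A failed step leaves no marker, so it is retried on the next start.
  if ! "$@"; then
    echo "failed ${name}" >&2
    return 1
  fi

  mkdir -p "${STATUS_DIR}"
  touch "${marker}"
  echo "done ${name}"
}
```

For example, `run_once file.downloaded some_download_command` would run the (hypothetical) download command on the first start and skip it on later restarts.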

##### [startup.sh](docker/startup.sh)
* Clone/pull Sophox git repo _(Use `REPO_URL` and `REPO_BRANCH` to override. Set `REPO_URL` to "-" to disable)_
* Generate random Postgres password
* Download OSM dump file and validate md5 sum. (creates _status/file.downloaded_)
* Initialize Osmosis state configuration / timestamp (needed for osm2pgsql updates)
* Start PostgreSQL and Blazegraph with [dc-databases.yml](docker/dc-databases.yml) and wait for them to activate
* Run all [dc-importers-*.yml](docker/) to parse the downloaded file into RDF TTL files and into Postgres tables. The TTL files are then imported into Blazegraph. This step runs without `--detach` and should take a few days to complete. Running it a second time should finish almost immediately. Note that if it crashes, you may have to do some manual cleanup (e.g. wipe it all clean).
* Run [dc-updaters-*.yml](docker/) and [dc-services-*.yml](docker/). Updaters will update OSM data -> PostgreSQL tables (geoshapes), OSM data->Blazegraph, and OSM Wiki->Blazegraph. 
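
Two of the steps above, validating the md5 sum and waiting for the databases to activate, follow common shell patterns. The sketch below shows one way to write them, assuming GNU `md5sum` is available. The helper names `verify_md5` and `wait_until` are hypothetical and are not taken from `startup.sh`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the MD5 of file $1 equals hex digest $2.
verify_md5() {
  local file="$1" expected="$2" actual
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(md5sum "${file}" | cut -d ' ' -f 1)"
  [[ "${actual}" == "${expected}" ]]
}

# Hypothetical helper: retry a command until it succeeds.
#   $1     maximum number of attempts
#   $2     seconds to sleep between attempts
#   $3...  the probe command
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "${delay}"
  done
  return 1
}
```

A probe such as `curl --fail --silent <service-url>` could be passed to `wait_until`. The actual URLs and timeouts depend on the compose files and are not shown here.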

##### [startup.gcp.sh](docker/startup.gcp.sh)
On GCP, an additional disk initialization step runs before `startup.sh`:
* If `DATA_DEV` is set, format and mount it as `DATA_DIR`.  Same applies to the optional `TEMP_DEV` + `TEMP_DIR`. _(e.g. `/dev/sdb`  as `/mnt/disks/data`, and `/dev/nvme0n1` as `/mnt/disks/temp`)_
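
As a rough sketch of this disk step, the function below formats and mounts a device only when the corresponding variable is set. It is an assumption-laden illustration, not the contents of `startup.gcp.sh`: the `init_disk` and `run` helpers and the `DRY_RUN` switch are hypothetical, xfs and the mount options are borrowed from the Hetzner example above, and a real script would also need to check whether the disk is already formatted or mounted.

```bash
#!/usr/bin/env bash

# Illustrative only: by default (DRY_RUN=1) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [[ "${DRY_RUN}" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: format device $1 as xfs and mount it at directory $2.
# Does nothing when the device argument is empty (variable not set).
init_disk() {
  local dev="$1" dir="$2"
  if [[ -z "${dev}" ]]; then
    return 0
  fi
  run mkfs.xfs "${dev}"
  run mkdir -p "${dir}"
  run mount -o discard,defaults "${dev}" "${dir}"
}

init_disk "${DATA_DEV:-}" "${DATA_DIR:-/mnt/disks/data}"
init_disk "${TEMP_DEV:-}" "${TEMP_DIR:-/mnt/disks/temp}"
```

With `DATA_DEV=/dev/sdb` set, the dry run prints the three commands it would execute; with neither variable set it prints nothing.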

## Development

Clone the repo with submodules.

If you have commit access to the Sophox repository, make sure to run this in order to automatically use ssh instead of https for submodules.

    git config --global url.ssh://git@github.com/.insteadOf https://github.com/


For testing, you may want to create a simple script (example below) in the docker directory, e.g. `docker/_belize.sh`, that uses [docker/startup.local.sh](docker/startup.local.sh) to run Sophox locally with a small OSM file. Use http://sophox.localhost to browse it. You may need to add `127.0.0.1   sophox.localhost` to your `hosts` file. Make sure the script's name begins with an underscore (such files are ignored by git).

```bash
#!/usr/bin/env bash

OSM_FILE=belize-latest.osm.pbf
OSM_FILE_REGION=central-america
MAX_MEMORY_MB=5000

### Uncomment any of these to disable a certain service/feature
# ENABLE_IMPORT_OSM2PGSQL=
# ENABLE_IMPORT_OSM2RDF=
# ENABLE_IMPORT_PAGEVIEWS=
# ENABLE_SVC_PROXY=
# ENABLE_SVC_GUI=
# ENABLE_SVC_MISC=
# ENABLE_UPDATE_METADATA=
# ENABLE_UPDATE_OSM2PGSQL=
# ENABLE_UPDATE_OSM2RDF=
# ENABLE_UPDATE_PAGEVIEWS=
# ENABLE_UPDATE_MAINTAIN=

source "$(dirname "$0")/startup.local.sh"
```

### Notes for Mac users