Closed DazWilkin closed 6 years ago
Ugh! I've got it mostly working but there's still a race condition with the trillian_db_seed that results in this not always working initially on a clean (docker system prune) system :-(
https://github.com/google/trillian/blob/master/examples/deployment/docker-compose.yml
I have docker-compose working but I'd value some insights into which of the several changes is required. I'm confident it's some combination of several.
host=mysql does not work
The docker-compose.yml uses references to mysql (I believe in reference to service: mysql) in an attempt to reference the database. For me this does not work. After much consternation, I realized that my debugging included a possible solution. This worked:
adminer:
  image: adminer
  restart: always
  ports:
    - 8080:8080
  depends_on:
    - mysql
  links:
    - mysql:db
The other containers don't assume (unlike adminer) that the container is called db, but it struck me that perhaps the links entry is necessary, and I think (!?) this provides a way to uniquely reference other containers. When I run docker-compose without links, the database is not actually called mysql but is called e.g. mysql_1. Using links for all the containers appears to solve host resolution.
So each service needs:
links:
  - mysql:db
And then references to the database become e.g. test:zaphod@tcp(db:3306)/test
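As a small illustration of how that connection string is put together, here is a sketch of a shell helper that assembles it from its parts. The function name `mysql_uri` and its argument order are hypothetical and are not part of Trillian or its scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Trillian): assemble a Go-style MySQL
# connection string of the form user:password@tcp(host:port)/database.
mysql_uri() {
  local user="$1" password="$2" host="$3" port="$4" database="$5"
  printf '%s:%s@tcp(%s:%s)/%s\n' "${user}" "${password}" "${host}" "${port}" "${database}"
}

# Reproduces the example from the text, using the link alias "db" as the host.
mysql_uri test zaphod db 3306 test
```

Keeping the host as a separate argument makes it easy to switch between a link alias such as db and a service name.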
I tried running MySQL and MariaDB containers standalone and then connecting to them from a 2nd container as a client. The only way I could get this to work was when I included --protocol=TCP to force comms over the TCP port. Unfortunately, resetdb.sh does not include this flag (see next).
docker run \
  --env=MYSQL_ROOT_PASSWORD=$DB_PASSWORD \
  -it \
  --publish=3306:3306 \
  mysql:8
Or:
docker run \
  --env=MYSQL_ROOT_PASSWORD=$DB_PASSWORD \
  -it \
  --publish=3306:3306 \
  mariadb:10.3.7
And:
docker run \
  --interactive \
  --tty \
  --net=host mysql:8 \
  mysql \
    --user=root \
    --password=${DB_PASSWORD} \
    --host=localhost \
    --port=3306 \
    --protocol=TCP \
    --execute="show databases;"
Works reliably.
I'm loath to propose changes to resetdb.sh knowing that this is used elsewhere, but it appears that this must include --protocol=TCP as a flag to address the above issue.
While I was in here I made the password property consistent with the other long names, so -p --> --password=
The result:
#!/bin/bash
set -e

usage() {
  echo "$0 [--force] [--verbose] ..."
  echo "accepts environment variables:"
  echo " - DB_NAME"
  echo " - DB_USER"
  echo " - DB_PASSWORD"
  echo " - DB_HOST"
  echo " - DB_PORT"
  echo " - DB_PROT"
}

collect_vars() {
  # set unset environment variables to defaults
  [ -z ${DB_USER+x} ] && DB_USER="root"
  [ -z ${DB_NAME+x} ] && DB_NAME="test"
  [ -z ${DB_HOST+x} ] && DB_HOST="localhost"
  [ -z ${DB_PORT+x} ] && DB_PORT="3306"
  [ -z ${DB_PROT+x} ] && DB_PROT="TCP"
  FLAGS=()

  # handle flags
  FORCE=false
  VERBOSE=false
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --force) FORCE=true ;;
      --verbose) VERBOSE=true ;;
      *) FLAGS+=("$1")
    esac
    shift 1
  done

  FLAGS+=(--user="${DB_USER}")
  FLAGS+=(--host="${DB_HOST}")
  FLAGS+=(--port="${DB_PORT}")
  FLAGS+=(--protocol="${DB_PROT}")

  # Optionally print flags (before appending password)
  [[ ${VERBOSE} = 'true' ]] && echo "- Using MySQL Flags: ${FLAGS[@]}"

  # append password if supplied
  [ -z ${DB_PASSWORD+x} ] || FLAGS+=(--password="${DB_PASSWORD}")
}

main() {
  collect_vars "$@"

  readonly TRILLIAN_PATH=$(go list -f '{{.Dir}}' github.com/google/trillian)

  # what we're about to do
  echo "Warning: about to destroy and reset database '${DB_NAME}'"

  [[ ${FORCE} = true ]] || read -p "Are you sure? [Y/N]: " -n 1 -r
  echo # Print newline following the above prompt

  if [ -z ${REPLY+x} ] || [[ $REPLY =~ ^[Yy]$ ]]
  then
    echo "Resetting DB..."
    echo "Flags: ${FLAGS[@]}"
    mysql "${FLAGS[@]}" -e "DROP DATABASE IF EXISTS ${DB_NAME};"
    mysql "${FLAGS[@]}" -e "CREATE DATABASE ${DB_NAME};"
    mysql "${FLAGS[@]}" -e "GRANT ALL ON ${DB_NAME}.* TO '${DB_NAME}' IDENTIFIED BY 'zaphod';"
    mysql "${FLAGS[@]}" -D ${DB_NAME} < ${TRILLIAN_PATH}/storage/mysql/storage.sql
    echo "Reset Complete"
  fi
}

main "$@"
Because of the minimal Dockerfiles and a desire to make Docker Compose and Kubernetes consistent, I've pulled much of the formatting into the docker-compose.yml file.
I also dropped -u mysql from the trillian-db-seed command; it is redundant if DB_HOST=db is provided too.
Unfortunately, whereas Kubernetes permits environment variables (values) to be incorporated into the runtime command-line, Docker Compose does not, and so the Docker Compose variant of the file is more static than I'd prefer. These variables could possibly (!?) be recreated and provided by the environment (e.g. ${DB_FLAG} and ${DB_PROVIDER} etc., as is done with ${DB_PASSWORD}), but for now:
version: "3.2"
services:
  mysql:
    image: mariadb:10.3.7
    restart: always
    environment:
      - MYSQL_ROOT_PASSWORD=${DB_PASSWORD}
  trillian-db-seed:
    build:
      context: ../..
      dockerfile: ./examples/deployment/docker/db_client/Dockerfile
    depends_on:
      - mysql
    links:
      - mysql:db
    environment:
      - DB_USER=root
      - DB_PASSWORD=${DB_PASSWORD}
      - DB_HOST=db
      - DB_PORT=3306
      - DB_PROT=TCP
    command: ./scripts/resetdb.sh --verbose --force
  trillian-log-server:
    build:
      context: ../..
      dockerfile: examples/deployment/docker/log_server/Dockerfile.new
    restart: always
    ports:
      - "8090:8090"
      - "8091:8091"
    depends_on:
      - mysql
    links:
      - mysql:db
    environment:
      - DB_USER=root
      - DB_PASSWORD=$DB_PASSWORD
    command: "--mysql_uri=test:zaphod@tcp(db:3306)/test --storage_system=mysql --rpc_endpoint=0.0.0.0:8090 --http_endpoint=0.0.0.0:8091 --alsologtostderr"
  trillian-log-signer:
    build:
      context: ../..
      dockerfile: examples/deployment/docker/log_signer/Dockerfile.new
    restart: always
    ports:
      - "8092:8091"
    depends_on:
      - mysql
    links:
      - mysql:db
    environment: [
      "DB_USER=root",
      "DB_PASSWORD=$DB_PASSWORD",
    ]
    command: "--mysql_uri=test:zaphod@tcp(db:3306)/test --storage_system=mysql --http_endpoint=0.0.0.0:8091 --sequencer_guard_window=0s --sequencer_interval=300ms --num_sequencers=10 --batch_size=2000 --force_master=true --alsologtostderr"
Feedback and guidance welcome.
I don't know if I can answer all your questions, but in Boulder, we use docker-compose routinely for all our testing, along with a mysql container and an initialization script. See these files:
https://github.com/letsencrypt/boulder/blob/master/docker-compose.yml https://github.com/letsencrypt/boulder/blob/master/test/create_db.sh https://github.com/letsencrypt/boulder/blob/master/test/entrypoint.sh
It's worth noting that links is deprecated: https://docs.docker.com/compose/compose-file/#links. We recently switched to using aliases with a lot of success. It doesn't matter what your database container is called. All you really care about is that it be resolvable under a predictable name from the other containers. aliases accomplishes that.
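To make the aliases suggestion concrete, here is a minimal sketch that writes a small compose file to a temporary path. The network name backend, the alias db, and the client service are illustrative assumptions; this is not Boulder's or Trillian's actual configuration.

```shell
#!/usr/bin/env bash
# Sketch only: write a hypothetical compose file in which the database
# service is reachable as "db" on a user-defined network via an alias.
compose_file="$(mktemp)"
cat > "${compose_file}" <<'EOF'
version: "3.2"
services:
  mysql:
    image: mariadb:10.3.7
    environment:
      - MYSQL_ROOT_PASSWORD=${DB_PASSWORD}
    networks:
      backend:
        aliases:
          - db   # other services on "backend" can resolve this name
  client:
    image: mysql:8
    depends_on:
      - mysql
    networks:
      - backend
networks:
  backend:
EOF
echo "${compose_file}"
```

Running docker-compose --file "${compose_file}" config should validate the file and print the resolved configuration without starting any containers.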
I think the reason you find yourself needing to add --protocol=TCP to your mysql command line is that mysql special-cases the hostname "localhost" and tries to use a Unix domain socket to connect to localhost. You can override that with --protocol=TCP. However, it seems like you are actually running your mysql client commands on a different container, which will reference the mysql container as mysql (or db if you like). If you change your mysql commands to have -h mysql, you probably won't need --protocol=TCP anymore. As an added benefit, that should allow you to get rid of --publish=3306:3306, which shouldn't be necessary.
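The localhost special case can be sketched as a small helper that adds --protocol=TCP only when the host is literally localhost. The function name `mysql_client_flags` is hypothetical; it is not part of resetdb.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print mysql client connection flags, one per line.
# The mysql client treats the literal hostname "localhost" as a request to
# use the Unix domain socket, so TCP is forced only in that case.
mysql_client_flags() {
  local host="${1:-localhost}"
  local port="${2:-3306}"
  local flags=("--host=${host}" "--port=${port}")
  if [[ "${host}" == "localhost" ]]; then
    flags+=("--protocol=TCP")
  fi
  printf '%s\n' "${flags[@]}"
}

mysql_client_flags localhost 3306   # includes --protocol=TCP
mysql_client_flags mysql 3306       # host and port only
```

With a service name or alias as the host, the client connects over TCP without any extra flag, which is why the flag becomes unnecessary once the commands use -h mysql.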
One tip that I ran into recently: When we run docker-compose up, everything is hunky-dory. When we run things like docker-compose run boulder ping boulder-mysql, it doesn't work. We realized we needed to run with --use-aliases. Otherwise docker-compose assumes you don't want the aliases when using run.
Very helpful. Thank you.
I'm more familiar with Kubernetes and inadvertently broke Trillian's Docker Compose while trying to refine the Kubernetes deployment.
I'll review this tomorrow. Your suggestions return me to a place closer to where the Docker Compose files were but, with those, I could not get the containers talking to the MySQL instance.
Thanks for the guidance!
@jsha Thanks! I have it working and your comment was very helpful.
The Docker Compose works when I define a network and use aliases (at your suggestion).
The Database (container) doesn't need the --port and --protocol flags, as you suggested.
However, I'm also able to get it working if I don't use mysql as the service name ...
This, I don't understand :-(
That's closer to how the file was originally (albeit broken) so I'm going to stick closer to it.
Thanks very much for the help!
However, I'm also able to get it working if I don't use mysql as the service name ...
When you say "as the service name," you mean the heading in the yaml file? E.g.
services:
  mysql:
That's expected if you're using aliases. In other words, if you configure component X to have an alias "mysql" that points to the MySQL container, then things running in that component can get the IP address of the MySQL container by looking up the name "mysql".
If you want to post your current config I can take a look.
Right! But that's not what wasn't working ;-)
The following (hypothetical) Docker Compose file does not work for me (but it should). test times out, unable to connect to mysql:
version: '3'
services:
  mysql:
    image: mariadb:10.1
    environment:
      - MYSQL_ROOT_PASSWORD=${DB_PASSWORD}
  test:
    image: mysql:8
    restart: always
    depends_on:
      - mysql
    environment:
      - MYSQL_USER="root"
      - MYSQL_PWD=${DB_PASSWORD}
    command: mysql --host=mysql --execute="show databases;"
yields:
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (110)
If I rename the service to something (anything?) other than mysql, e.g. db or xxx, it works:
version: '3'
services:
  xxx:
    image: mariadb:10.1
    environment:
      - MYSQL_ROOT_PASSWORD=${DB_PASSWORD}
  test:
    image: mysql:8
    restart: always
    depends_on:
      - xxx
    environment:
      - MYSQL_USER="root"
      - MYSQL_PWD=${DB_PASSWORD}
    command: mysql --host=xxx --execute="show databases;"
yields:
test_1 | Database
test_1 | information_schema
test_1 | mysql
test_1 | performance_schema
test_1 | Database
test_1 | information_schema
test_1 | mysql
test_1 | performance_schema
What command are you running? The above roughly works for me with docker-compose up or docker-compose run --use-aliases test. If you use docker-compose run without --use-aliases it definitely won't work.
Also note that there are some timing issues around waiting for mysql to come up before you try to connect to it. Your simplified example here doesn't include a wait-for-it command but I gather from the previous comments that there's a thing called wait-for-it used by the "real" containers, which waits for a given TCP port to become available.
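As a rough idea of what such a wait step does, here is a minimal sketch of a port-polling function. It is not the wait-for-it script itself; the name `wait_for_port`, its arguments, and the reliance on bash's /dev/tcp and the coreutils timeout command are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a wait-for-it style check: poll host:port until
# a TCP connection succeeds or the overall timeout (seconds) elapses.
wait_for_port() {
  local host="$1" port="$2" timeout_secs="${3:-30}"
  local deadline=$(( SECONDS + timeout_secs ))
  while (( SECONDS < deadline )); do
    # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connect.
    # Each attempt is capped at 2 seconds so a silent host cannot stall the loop.
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Example (not executed here): wait up to 60s for the database, then seed it.
# wait_for_port mysql 3306 60 && ./scripts/resetdb.sh --verbose --force
```

A successful connect only shows that something is listening on the port; the server may still need a moment before it accepts logins.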
I've been using:
docker-compose --file=$PWD/docker-compose.yml up --remove-orphans
When I want to be more emphatic about cleaning the slate:
docker system prune --force && \
docker-compose --file=$PWD/docker-compose.yml up --remove-orphans --build
Yes, in the case above, service: mysql takes time to fail but, once it times out, it's done for good. With service: something-else, the initial connection attempts fail while the database boots and then it's golden.
Yes, the Trillian folks use wait-for-it to help:
./wait-for-it.sh -t 0 mysql:3306 -- ./scripts/resetdb.sh --verbose --force
Here are the logs from mysql; even after the database container comes ready, the client remains unable to connect to it:
mysql_1 | Version: '10.1.33-MariaDB-1~jessie' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (110)
deployment_test_1 exited with code 1
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (110)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (110)
Whereas, with anything-other-than-mysql (henryhoops), it fails quickly until the database comes ready:
henryhoops_1 | Initializing database
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
deployment_test_1 exited with code 1
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
deployment_test_1 exited with code 1
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
deployment_test_1 exited with code 1
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
henryhoops_1 | Version: '10.1.33-MariaDB-1~jessie' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
deployment_test_1 exited with code 0
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)
test_1 | Database
test_1 | information_schema
test_1 | mysql
test_1 | performance_schema
test_1 | Database
test_1 | information_schema
test_1 | mysql
test_1 | performance_schema
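The two runs above end their error lines with different codes, (110) and (111). Assuming, as is usual for the mysql client on Linux, that this trailing number is the OS errno from the failed connect, the following sketch names the two codes. The function `explain_connect_error` is hypothetical and classifies only these two values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the trailing "(NNN)" out of a mysql client
# "ERROR 2003" line and name the Linux errno it corresponds to.
explain_connect_error() {
  local line="$1"
  local re='\(([0-9]+)\)[[:space:]]*$'
  if [[ ! "${line}" =~ ${re} ]]; then
    echo "no errno found"
    return 1
  fi
  local code="${BASH_REMATCH[1]}"
  case "${code}" in
    110) echo "110 ETIMEDOUT: connection attempt timed out" ;;
    111) echo "111 ECONNREFUSED: host reachable, nothing listening on the port" ;;
    *)   echo "${code}: not classified by this sketch" ;;
  esac
}

explain_connect_error "ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (110)"
explain_connect_error "ERROR 2003 (HY000): Can't connect to MySQL server on 'henryhoops' (111)"
```

Under that assumption, the renamed service is refused quickly until the server starts listening, while attempts against the name mysql time out, which matches the slow failures described earlier.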
What's the status on this issue? Still a problem?
I've not used it since.
I recall that we were unable to explain why renaming the services addressed the issue.
I was pursuing the docker-compose route to unbreak some of the changes I'd introduced to the DOCKERFILEs and bash scripts before I realized Trillian had a docker-compose way too.
Thanks for the help, using links: mariadb as well as the network bridge worked fine.
docker-compose up --build yields:
and trillian-db-seed appears to never get a connection (through wait-for-it) to the database, but I think it's not wait-for-it that's at fault, as if I replace this with a long sleep 15s and try that way, I get errors too:
IIUC it should not be attempting to connect through that socket but the port. The output suggests that it is using localhost:3306 but this doesn't work even if I change this to mysql using $DB_HOST.
The database is ready:
If I add adminer into the compose file, I'm able to reach the database using it:
and then I can localhost:8080 with ${DB_PASSWORD}
In this case, it appears to have gotten as far as creating (a) test DB but it contains no tables; (b) test user.
There's something in the bowels of docker-compose and|or MySQL (MariaDB) and|or Trillian that I'm not seeing because I'm entirely unable to understand what's broken and how to fix it and would value some inspiration.