GMOD / docker-apollo

:whale: Apollo 2.X Docker Image
GNU General Public License v3.0

Using webapollo docker with postgresql docker #17

Closed fikipollo closed 7 years ago

fikipollo commented 7 years ago

Hi there!

First, thanks for this dockerized version of webapollo! It's really useful!

I've forked this repository to customize the code a little. I found that the current version of the image uses an internal postgresql service, which makes it hard to store the databases in a volume for data persistence. As an alternative, I'm using an external postgres server running in the official postgres docker image. I'm pretty sure that this is something you were planning to incorporate into your docker image, so I hope you appreciate my contribution ;)

I've changed 2 files: apollo-config.groovy and launch.sh

apollo-config.groovy I've added a new environment variable, WEBAPOLLO_DB_HOST, that can be used to set the IP address or the hostname of the postgres server. This variable replaces the one called WEBAPOLLO_DB_URI because mine can be used for both the apollo and the chado databases. The database URL is now built dynamically.

    production {
        dataSource {
            dbCreate = "update"
            username = System.getenv("WEBAPOLLO_DB_USERNAME") ?: "apollo"
            password = System.getenv("WEBAPOLLO_DB_PASSWORD") ?: "apollo"

            driverClassName = "org.postgresql.Driver"
            dialect = "org.hibernate.dialect.PostgresPlusDialect"
            url = "jdbc:postgresql://" + (System.getenv("WEBAPOLLO_DB_HOST") ?: "127.0.0.1").replaceAll(~/\\/$/, "") + "/apollo"
....

        dataSource_chado {
            dbCreate = "update"
            username = "apollo"
            password = "apollo"

            driverClassName = "org.postgresql.Driver"
            dialect = "org.hibernate.dialect.PostgresPlusDialect"
            url = "jdbc:postgresql://" + (System.getenv("WEBAPOLLO_DB_HOST") ?: "127.0.0.1").replaceAll(~/\\/$/, "") + "/chado"

...
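As far as the Groovy expression can be read, it takes the host from WEBAPOLLO_DB_HOST (falling back to 127.0.0.1), strips a trailing slash, and appends the database name. A minimal shell sketch of the same logic, useful for checking which URL a given setting would produce; the helper name `jdbc_url` is hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper mirroring the Groovy expression: default host,
# strip one trailing slash, append the database name.
jdbc_url() {
    db_name="$1"
    host="${WEBAPOLLO_DB_HOST:-127.0.0.1}"   # same fallback as the Groovy config
    host="${host%/}"                          # drop a single trailing "/"
    echo "jdbc:postgresql://${host}/${db_name}"
}

jdbc_url apollo
jdbc_url chado
```

With WEBAPOLLO_DB_HOST unset, the two calls print the 127.0.0.1 URLs for the apollo and chado databases; with a value such as `db/`, the trailing slash is removed before the database name is appended.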

launch.sh Many changes in this file. First of all, I check whether the variable WEBAPOLLO_DB_HOST has been set. If so, we assume that we want to use an external postgres, so we don't need to start our internal postgres service, and we need to pass the "-h HOST" flag in all the postgres calls. Then I check whether the apollo database already exists in the postgres server. If not, we create the new database, create the apollo user and set the privileges. After that, we do the same for the chado database. With this approach we avoid running the chado.sql script every time the container starts (user.sql is not necessary anymore).

Finally, we restart tomcat using the shutdown.sh/startup.sh scripts instead of "service tomcat start", which is sometimes a little buggy. Normally the script keeps showing the content of catalina.out until the container is stopped. However, I found that sometimes this file does not exist (first launch?), so to avoid possible errors I force the creation of the file using "touch" (actually, now I realize that the if-then clause is not necessary).

#!/bin/bash

if [[ "${WEBAPOLLO_DB_HOST}" == "" ]]; then
    echo "Using internal postgresql service..."
    service postgresql start
    WEBAPOLLO_DB_HOST=127.0.0.1
else
    echo "Using external postgresql service (${WEBAPOLLO_DB_HOST})..."
    HOST_FLAG="-h ${WEBAPOLLO_DB_HOST}"
fi

until pg_isready $HOST_FLAG; do
    echo -n "."
    sleep 1;
done

echo "Postgres is up..."
su postgres -c "psql $HOST_FLAG -lqt | cut -d \| -f 1 | grep -qw apollo"
if [[ "$?" == "1" ]]; then
    echo "Apollo database not found, creating..."
    su postgres -c "createdb $HOST_FLAG apollo"
    su postgres -c "psql $HOST_FLAG -c \"CREATE USER apollo WITH PASSWORD 'apollo';\""
    su postgres -c "psql $HOST_FLAG -c 'GRANT ALL PRIVILEGES ON DATABASE \"apollo\" to apollo;'"
fi

su postgres -c "psql $HOST_FLAG -lqt | cut -d \| -f 1 | grep -qw chado"
if [[ "$?" == "1" ]]; then
    echo "Chado database not found, creating..."
    su postgres -c "createdb $HOST_FLAG chado"
    su postgres -c "psql $HOST_FLAG -c 'GRANT ALL PRIVILEGES ON DATABASE \"chado\" to apollo;'"
    su postgres -c "PGPASSWORD=apollo psql -U apollo -h ${WEBAPOLLO_DB_HOST} chado -f /chado.sql"
fi

# https://tomcat.apache.org/tomcat-8.0-doc/config/context.html#Naming
FIXED_CTX=$(echo "${CONTEXT_PATH}" | sed 's|/|#|g')
WAR_FILE=${CATALINA_HOME}/webapps/${FIXED_CTX}.war

cp ${CATALINA_HOME}/apollo.war ${WAR_FILE}

${CATALINA_HOME}/bin/shutdown.sh
${CATALINA_HOME}/bin/startup.sh

if [[ ! -f "${CATALINA_HOME}/logs/catalina.out" ]]; then
    touch ${CATALINA_HOME}/logs/catalina.out
fi

tail -f ${CATALINA_HOME}/logs/catalina.out
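The database existence check in the script could also be factored into a small helper so that it is written once and can be tried without a running server. A minimal sketch, assuming the usual `psql -lqt` layout with the database name in the first `|`-separated column; the function name `db_listed` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not part of the original launch.sh).
# Reads `psql -lqt` style output on stdin and succeeds only if the first
# column is exactly the given database name.
db_listed() {
    name="$1"
    # Keep the first column, drop padding spaces, then require a fixed-string,
    # whole-line match so that "apollo" does not match "apollo-test".
    cut -d '|' -f 1 | tr -d ' ' | grep -qxF -- "$name"
}

# Possible use against a server (assumes psql and createdb are installed):
#   psql $HOST_FLAG -lqt | db_listed apollo || createdb $HOST_FLAG apollo
```

The whole-line match is a slightly stricter choice than `grep -qw`, which treats a hyphen as a word boundary.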

And finally, I've also created a docker-compose.yml file that launches webapollo + postgres and accepts different environment variables for customizing the instance (it looks like only WEBAPOLLO_DB_HOST is working at the moment).

I hope that helps!

Thanks and best regards!!

nathandunn commented 7 years ago

We've been hashing through this a bit lately, so a good time to provide input.

I'll try to put a PR together and I'll flag you for review.

Thanks.

hexylena commented 7 years ago

Ah, I am sorry you did not ask first. I'm the original author of the apollo docker containers. I originally did not even build a unified image, I have only ever used an apollo-only image. The work you've talked about is mostly already done. @nathandunn is re-doing his versions of the images that GMOD maintains.

E.g. I have

My images (the original apollo containers) did not use the service start method, as they were based off of the standard tomcat container. The GMOD containers switched to an ubuntu base.

nathandunn commented 7 years ago

@erasche Yeah, there is a lot of overlap between the two solutions, but I think that a lot of the DB handling is nicer than what we currently have.

fikipollo commented 7 years ago

OK, I see... so which one is the most "official" version? I'm a little bit confused here :S I was using @nathandunn's solution, as I understood that webapollo is part of the GMOD project.

@erasche your solution looks great! Actually, your docker image was the first one that I downloaded from hub.docker.com, but I saw that the docker build had been failing for 6 months and I assumed that your image was not maintained anymore.

In any case, thanks both for the help!!

hexylena commented 7 years ago

@fikipollo I've moved my images to quay.io because docker hub is a massive pain. I'm sorry, I should have updated the README there and pointed to the newer/correct location: https://quay.io/repository/erasche/apollo. I will do that shortly.

GMOD maintains their own images, and these are "official"; however, I do not often contribute my changes back to them. I do not believe they deploy apollo from docker, but I do, so I end up changing my images a bit more often in response to things I need. Unfortunately I do not provide an image for "apollo, from master, non-unified", I only have ones for

I'll add one for master + split image without remote_user, the more common deployment method.

@nathandunn I disagree

fikipollo commented 7 years ago

OK, cool! I will take a look at your images @erasche !

Thanks again to both!

nathandunn commented 7 years ago

@erasche Yeah, I like the greater configuration as well. Was going to merge all of the good things (correct defaults) and remove the bad ones (cruft and in-configurability) ;)

That being said, I didn't realize you had that nice removal of build files from the image. I'm testing that right now in quay. I can't imagine why we would need to keep it up there. Just an oversight on my part.

I'll update the doc once I get it working.

Thanks.

nathandunn commented 7 years ago

FYI: https://quay.io/repository/gmod/docker-apollo

fikipollo commented 7 years ago

The new version using alpine is great and saves a huge amount of space! Nice work!

Just FYI, I was working on the idea of allowing users to interactively prepare the data within the container (i.e. using an interactive command line) using the jbrowse tools, and I realized that these tools are present in the image even when the /apollo directory is removed after building. The jbrowse tools are included in the WAR file and are available right after the WAR file is unpacked; however, some necessary perl modules are not installed during Apollo deployment.

To solve this issue, I just needed to install perl, perl-dev, db, db-dev and wget before building Apollo. When these packages are available, jbrowse is automatically built and the perl modules are installed. After that, in an interactive terminal I can run the tools and generate my data for WebApollo. E.g. perl /usr/local/tomcat/webapps/ROOT/jbrowse/bin/prepare-refseqs.pl --fasta /data/raw/scf1117875582023.fa --out /data/tracks