[Open] psychemedia opened this issue 7 years ago
That's a very good idea. I'm not entirely sure how to accomplish it, but I think the Dockerfile
would probably need something like the following commands to run it:
```dockerfile
RUN python /import.py /data/RegPlusExtract_January_2017.zip
RUN mysql -e "CREATE DATABASE ccextract;"
RUN mysql ccextract < table-definition.sql
RUN mysqlimport --local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by="\"" --fields-escaped-by=\ --verbose --delete ccextract data/extract_acct_submit.csv data/extract_aoo_ref.csv data/extract_ar_submit.csv data/extract_charity.csv data/extract_charity_aoo.csv data/extract_class.csv data/extract_class_ref.csv data/extract_financial.csv data/extract_main_charity.csv data/extract_name.csv data/extract_objects.csv data/extract_partb.csv data/extract_registration.csv data/extract_remove_ref.csv data/extract_trustee.csv
```
(Would need proper login details added to the mysql commands).
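Rather than hard-coding that long file list, the `mysqlimport` invocation could be generated from whatever `extract_*.csv` files are actually present. A hypothetical sketch (not from the original scripts; the credential defaults are assumptions matching the compose file below):

```python
# Hypothetical sketch: build the long mysqlimport command from the
# extract_*.csv files found in a data/ directory, instead of listing
# them by hand. Credentials are illustrative defaults.
import glob
import shlex

def build_mysqlimport_cmd(data_dir="data", db="ccextract",
                          user="root", password="charitable"):
    csvs = sorted(glob.glob(data_dir + "/extract_*.csv"))
    cmd = ["mysqlimport", "-u" + user, "-p" + password,
           "--local", "--ignore-lines=1",
           "--fields-terminated-by=,",
           '--fields-optionally-enclosed-by="',
           "--fields-escaped-by=\\",
           "--verbose", "--delete", db] + csvs
    # Quote each part so the string is safe to paste into a shell
    return " ".join(shlex.quote(part) for part in cmd)
```

The returned string could then be dropped into a `RUN` line or an init script.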
Have you got an example of a similar container?
You might find this useful/interesting too: https://github.com/tithebarn/charity-base. Same data, but in MongoDB and with an API.
Closest I've got is PostgreSQL and that's muddled up in https://github.com/psychemedia/ou-tm351/tree/master/TM351_docker_Testing
I really need to do a simple worked teaching example for my own reference/reuse!
There are examples of getting node running in containers at eg https://nodejs.org/en/docs/guides/nodejs-docker-webapp/
That's useful, thanks. I'd be happy to collaborate on a worked example - tbh I find docker and particularly docker-compose
confusing and struggle to get my head around it sometimes!
I made a start but hit issues I don't have time to debug right now:-( (Most of my own reusable scripts are for postgres or mongo).
Init script for use by the Dockerfile - this breaks when the build process tries to run it (can't connect). The `chinit.sh` file:
```shell
#!/bin/bash
echo "Starting up..."
#I'm not sure if we need to start and stop the db daemon?
#Couldn't offhand find a Hello World demo that just worked..!
#http://stackoverflow.com/a/29150538/454773
/usr/sbin/mysqld &
#until mysqladmin ping >/dev/null 2>&1; do
#    echo -n "."; sleep 0.2
#done
sleep 5
echo "Creating database..."
mysql -e "CREATE DATABASE ccextract;"
echo "...done"
echo "Extracting..."
mysql ccextract < /tmp/table-definition.sql
echo "...done"
echo "Importing..."
mysqlimport --local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by="\"" --fields-escaped-by=\ --verbose --delete ccextract data/extract_acct_submit.csv data/extract_aoo_ref.csv data/extract_ar_submit.csv data/extract_charity.csv data/extract_charity_aoo.csv data/extract_class.csv data/extract_class_ref.csv data/extract_financial.csv data/extract_main_charity.csv data/extract_name.csv data/extract_objects.csv data/extract_partb.csv data/extract_registration.csv data/extract_remove_ref.csv data/extract_trustee.csv
echo "...done"
#Do we need to shut down any db daemon we started?
#mysqladmin shutdown
echo "...and done"
```
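The commented-out `until mysqladmin ping` loop is the right idea: poll until the server actually answers, rather than hoping `sleep 5` is long enough. The same pattern as a small generic helper (a sketch, not part of the original scripts):

```python
# Sketch of the commented-out readiness loop above: poll a check
# until it succeeds or a timeout passes, instead of a fixed sleep.
import time

def wait_until(check, timeout=30.0, interval=0.2):
    """Call check() repeatedly until it returns True or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False
```

Used from a startup script it might look like `wait_until(lambda: subprocess.call(["mysqladmin", "ping"]) == 0)` before the `CREATE DATABASE` step - assuming `mysqladmin` is on the path in the image.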
The `Dockerfile` to build the MySQL container - this can go in the same directory as the other files, along with the data zip file ...
```dockerfile
FROM mysql/mysql-server
#We need to copy (ADD) the files we're using from the local directory into the container image
ADD import.py ./import.py
ADD bcp.py ./bcp.py
ADD RegPlusExtract_March_2017.zip data/RegPlusExtract_March_2017.zip
ADD table-definition.sql /tmp/table-definition.sql
RUN python ./import.py data/RegPlusExtract_March_2017.zip
#Anything in /docker-entrypoint-initdb.d/ should run as an init script?
ADD chinit.sh /docker-entrypoint-initdb.d/chinit.sh
#Not sure if we should just explicitly run things directly, at least in testing!
#i.e. comment out the /docker-entrypoint-initdb.d/ ADD and replace with:
#ADD chinit.sh /chinit.sh
#RUN /chinit.sh
```
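On the question in the comment: yes, the MySQL images run any `*.sh` and `*.sql` files found in `/docker-entrypoint-initdb.d/` - but at *first container start*, with the server already up, not at build time. That is also why the `RUN /chinit.sh` variant can't connect: there is no running `mysqld` during the build. A sketch of leaning on the entrypoint instead (file names assumed; the numeric prefixes just fix the run order, since scripts run alphabetically):

```dockerfile
# These run on first container start, after mysqld is up -- so the
# init script would not need to launch /usr/sbin/mysqld itself.
ADD table-definition.sql /docker-entrypoint-initdb.d/10-tables.sql
ADD chinit.sh /docker-entrypoint-initdb.d/20-chinit.sh
```

The database itself could be created either with a `CREATE DATABASE` at the top of the SQL file or, I believe, by setting the image's `MYSQL_DATABASE` environment variable in the compose file.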
The following `docker-compose.yml` can also go in the same folder - run it with `docker-compose up --build -d` and it will build the MySQL container from the local `Dockerfile`, pull down the Jupyter notebook image from Dockerhub, start the two containers and link them together:
```yaml
#docker-compose up --build -d
mysqlcc:
  build: .
  environment:
    MYSQL_ROOT_PASSWORD: charitable
  expose:
    - "3306"
jupytercc:
  image: jupyter/scipy-notebook
  links:
    - mysqlcc
  ports:
    - "8888"
  volumes:
    - .:/home/jovyan/work
```
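Before debugging SQLAlchemy errors from the notebook, it can help to confirm the linked container is reachable at all. A small stdlib-only sketch (the `mysqlcc`/`3306` names in the usage note come from the compose file above):

```python
# Sketch: check whether a TCP port answers at all, to separate
# "container not reachable" from "MySQL rejected the connection".
import socket

def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

From the notebook container, `port_open("mysqlcc", 3306)` returning `False` would point at the linking/startup, while `True` plus an SQL error would point at MySQL's account setup.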
Running a python2 notebook, we should then be able to connect using something like the following:
```python
##Use a py2 notebook (most straightforward mysql py package doesn't work in py 3?)
!conda install -y mysql-python
import sqlalchemy
engine = sqlalchemy.create_engine('mysql://root:charitable@mysqlcc:3306')
engine.execute("USE ccextract")
```
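For a Python 3 notebook, the pure-Python `pymysql` driver avoids the py2-only MySQL-python package; SQLAlchemy picks the driver from the URL. A sketch of the URL construction (credentials and hostname as in the compose file above):

```python
# Sketch: a "mysql+pymysql://" URL tells SQLAlchemy to use the
# pure-Python pymysql driver, which also works under Python 3.
def mysql_url(user="root", password="charitable",
              host="mysqlcc", port=3306, db="ccextract"):
    return "mysql+pymysql://{}:{}@{}:{}/{}".format(
        user, password, host, port, db)
```

The result would then be passed straight to `sqlalchemy.create_engine(mysql_url())` (with `pymysql` conda/pip installed first).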
However, on the one occasion I did get the db running (not sure how...), I got a disallowed connection message: the connection was coming in from a disallowed IP address (the IP address of the Jupyter notebook container). So no idea what's happening there either... (I'm a bit out of practice with this docker stuff; maybe my memories of it "just working" are all false memories!)
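That disallowed-connection message sounds like the image's default account setup: as far as I know, `mysql/mysql-server` creates `root@localhost` only, so connections from the Jupyter container's IP get rejected. If that's the cause, the image's `MYSQL_ROOT_HOST` environment variable should open it up - a sketch of the relevant compose fragment (`%` allows any host, which is fine for a local demo but not for anything exposed):

```yaml
mysqlcc:
  build: .
  environment:
    MYSQL_ROOT_PASSWORD: charitable
    MYSQL_ROOT_HOST: "%"
```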
Thanks for publishing these scripts. I was wondering if you also have `Dockerfile`s and `docker-compose` scripts that can build e.g. a MySQL database containing the data in one container, and then demonstrate R or Python linked containers for analysing the data in the database?