NYUCCL / psiTurk

An open platform for science on Amazon Mechanical Turk.
https://psiturk.org
MIT License

Feature request: Docker container image #162

Open twiecki opened 9 years ago

twiecki commented 9 years ago

Great project. As a suggestion, I think you guys should look into Docker as a psiTurk in a box. This will make deployment totally trivial. You can also set up docker hub to rebuild the image for every push to master.

jhamrick commented 9 years ago

:+1: to this! I'm happy to take this on, though I probably won't have time until February or so.

twiecki commented 9 years ago

I think this should serve as good inspiration: https://github.com/jupyter/jupyterhub/blob/master/Dockerfile It copies the user file and starts the server when the container is started.

jhamrick commented 9 years ago

Yeah, something like that is where I was thinking of starting from (I've been working with some of the jupyterhub docker stuff recently for a class I'm running this spring).

gureckis commented 9 years ago

:+1:, i'm all for this.

twiecki commented 9 years ago

Here's a humble beginning:

FROM ubuntu:14.04

MAINTAINER Thomas Wiecki <thomas.wiecki@gmail.com>

RUN apt-get update && apt-get upgrade -y && apt-get install -y wget libsm6 libxrender1 libfontconfig1 python-pip python-dev libncurses-dev python-flask

RUN pip install psiturk

ADD .psiturkconfig /root/.psiturkconfig
RUN mkdir /root/myexp
ADD config.txt index.html /root/myexp/
ADD static /root/myexp/static
ADD templates /root/myexp/templates

# Clean up
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
EXPOSE 22362

# Start psiturk from the experiment directory created above
CMD cd /root/myexp; HOME=/root psiturk

Should change this to not run as root, but it seems to work.
To build: docker build -t psiturk .
To run: docker run -p 22362:22362 -it psiturk

twiecki commented 9 years ago

I think the psiturk image could contain just the installation, and users could then inherit from that image on Docker Hub.

jhamrick commented 9 years ago

Yeah, I spent a bit of time thinking about the way to do this and there are sort of two options:

  1. Have a general-purpose psiTurk image that has a psiturk entrypoint, and when it is run, require the user to mount the current directory inside the container and then run with that.
  2. Build the image with the intent that it will be inherited from. You could do ONBUILD ADD instead of ADD so then when the inherited image is built it will add the appropriate files. Probably the inherited image would want to include environment variables for the AWS credentials and ad server credentials.
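
Option 2 could be sketched roughly as follows. This is only an illustration, not an official psiTurk image: the script writes a hypothetical base Dockerfile whose ONBUILD triggers copy the experiment files when a child image is built. The package list and paths mirror the Dockerfile earlier in this thread; the OUTDIR variable and the choice of COPY over ADD are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical base Dockerfile that uses ONBUILD triggers.
set -eu

# OUTDIR is a made-up variable for this example; defaults to a fresh temp dir.
outdir="${OUTDIR:-$(mktemp -d)}"
mkdir -p "$outdir"

cat > "$outdir/Dockerfile" <<'EOF'
FROM ubuntu:14.04
RUN apt-get update \
    && apt-get install -y python-pip python-dev libncurses-dev \
    && pip install psiturk
ENV HOME=/root
# ONBUILD instructions do not run now; they run when a child image
# is built FROM this one, using the child's build context.
ONBUILD COPY .psiturkconfig /root/.psiturkconfig
ONBUILD COPY . /root/myexp
WORKDIR /root/myexp
EXPOSE 22362
CMD ["psiturk"]
EOF

echo "wrote $outdir/Dockerfile"
```

An experiment's own Dockerfile would then only need a FROM line naming the base image, and its build context would have to contain .psiturkconfig alongside the experiment files. AWS and ad server credentials are deliberately left out of the sketch.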

twiecki commented 9 years ago

Good ideas! I feel a bit that number 1 goes against the philosophy of docker as the image would not be independent of the environment anymore. I think the ONBUILD ADD is a pretty nice interface. It works well for e.g. JupyterHub.

braingineer commented 9 years ago

Hi all,

I'm about to begin rolling out some experiments using Docker & Psiturk. I think I've done (1) that @jhamrick suggested and that @twiecki advised possibly not doing. I found the solution below pretty easy, but would you recommend a better way?

3 things below:

  1. a note on a change to the config.txt that baffled me for a while
  2. the Dockerfile I'm currently using
  3. the bash script I use to run an experiment

  1. I found that if host = 0.0.0.0 was not set in the config.txt, I was getting all kinds of problems. But I also have a super hardened, public-facing machine. Maybe this is a de facto step already prior to deployment, but I was unaware.
  2. My Dockerfile. Compared with the one above, mine seems a bit barer and messier:
#  the psiturk docker image creation file 
#########################################################

# Base image = Ubuntu
FROM ubuntu

MAINTAINER Brian McMahan

RUN apt-get update
RUN apt-get -y upgrade
RUN apt-get install -y python2.7 python-dev build-essential python-pip
RUN apt-get install -y libncurses5-dev
RUN pip install --upgrade pip
RUN exec bash
RUN pip install psiturk
ADD .psiturkconfig .
EXPOSE 22362
  3. The bash script (run_experimentname_experiment.sh). I have it sitting in a folder structure like:
base_folder/
---- run_foo_experiment.sh
---- run_bar_experiment.sh
---- foo/
--------- config.txt .... etc
---- bar/
--------- config.txt .... etc

and then the bash script (e.g. run_foo_experiment.sh) is:

#!/bin/bash
# Mount ./foo at /experiment and start the psiturk shell from there
docker run -it -p 22362:22362 -v "$(pwd)/foo":/experiment psiturk:dockerfile sh -c 'cd /experiment && psiturk'

Then run it with ./run_foo_experiment.sh, which drops you into the psiturk shell inside the container. From there you get all the psiturk commands, like server on, etc. Then use ctrl-p, ctrl-q to detach from the container and leave it running.
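
The host = 0.0.0.0 note in item 1 is easy to forget, so a small pre-flight check before starting the container may help. A server bound to localhost inside a container is generally not reachable through a published port, which is probably why the setting mattered. The helper names below (config_host, check_config) are made up for this sketch and are not part of psiTurk.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers that inspect a psiTurk config.txt.
set -eu

# Print the value of the first "host = ..." line, with whitespace removed.
config_host() {
    sed -n 's/^[[:space:]]*host[[:space:]]*=[[:space:]]*//p' "$1" \
        | head -n 1 | tr -d '[:space:]'
}

# Succeed only if the config binds the server to all interfaces.
check_config() {
    if [ "$(config_host "$1")" = "0.0.0.0" ]; then
        echo "ok: $1 sets host = 0.0.0.0"
    else
        echo "warning: $1 does not set host = 0.0.0.0" >&2
        return 1
    fi
}

# Usage (assuming the folder layout above):
#   check_config foo/config.txt && ./run_foo_experiment.sh
```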


All the best, Brian

braingineer commented 9 years ago

Though I am currently doing it this way (with the command-line option -v $(pwd)/foo:/experiment to share a folder between the host's disk and the container) because saving data is much simpler. It saves to the sqlite database, and the database file is kept in sync with the host through the mounted folder.

I have been thinking about spawning a mysql container and linking them so that each running experiment can use that instead. But that seems a bit more complicated and not worth the effort at the moment.

mvdoc commented 8 years ago

This is my attempt to use docker-compose to start a psiturk container + mysql and link them. So far I have only tested it on my server with the psiturk-example, and it seems to work. Thanks to @twiecki and @braingineer for the inspiration from their Dockerfiles.

Here's the repo https://github.com/mvdoc/psiturk-docker

My solution, too, was to map a local directory to a volume within the container (as @braingineer did). It works well in my opinion. One of the major drawbacks is that I didn't find a way to set the permissions correctly when using another user within the container (using root makes me paranoid). I welcome suggestions :)
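
For readers who have not used docker-compose, the general shape of a psiturk + mysql setup might look like the sketch below. It is not the contents of the repository linked above: the service names, image tag, credentials, and paths are all placeholders, and it assumes a Dockerfile for the psiturk image sits next to the compose file. The psiturk image would also need a MySQL driver for Python installed, which the sketch does not cover.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical docker-compose.yml for psiturk + mysql.
set -eu

# OUTDIR is a made-up variable for this example; defaults to a fresh temp dir.
outdir="${OUTDIR:-$(mktemp -d)}"
mkdir -p "$outdir"

cat > "$outdir/docker-compose.yml" <<'EOF'
# All names and passwords below are placeholders.
version: "2"
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: change-me-root
      MYSQL_DATABASE: psiturk
      MYSQL_USER: psiturk
      MYSQL_PASSWORD: change-me
  psiturk:
    build: .
    working_dir: /experiment
    ports:
      - "22362:22362"
    volumes:
      - ./experiment:/experiment
    depends_on:
      - db
    stdin_open: true
    tty: true
EOF

echo "wrote $outdir/docker-compose.yml"
# Inside the compose network the database is reachable by its service name.
echo "config.txt would need: database_url = mysql://psiturk:change-me@db/psiturk"
```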

adamliter commented 7 years ago

I've started a repository for psiTurk Docker images (GitHub; DockerHub).

The Dockerfile is (at the time of writing this comment):

FROM ubuntu:latest
MAINTAINER Adam Liter <io@adamliter.org>

ARG PSITURK_VERSION
ENV PSITURK_VERSION=${PSITURK_VERSION:-2.2.3} \
    PSITURK_GLOBAL_CONFIG_LOCATION=/psiturk/

RUN apt-get update -y \
    && apt-get upgrade -y \
    && apt-get install -y \
        python \
        python-pip \
    && pip install --upgrade \
        pip \
        setuptools \
        wheel \
    && pip install --upgrade \
        psiturk==${PSITURK_VERSION} \
    && rm -rf /var/lib/apt

WORKDIR /psiturk

EXPOSE 22362

The idea is that people build on top of this Docker file in their own projects. This approach allows for either of the two options proposed by @jhamrick above.


Mount volume

If you want to mount a volume, you just have to create a Dockerfile with the following content:

# or adamliter/psiturk:latest
FROM adamliter/psiturk:2.2.3

VOLUME ["/psiturk"]

Then just do the following (assuming your experiment is in a subdirectory called experiment):

docker build -t <YOUR_USERNAME>/<YOUR_EXPERIMENT> .
docker run -it --rm --name <YOUR_EXPERIMENT> -p 22362:22362 -v `pwd`/experiment:/psiturk <YOUR_USERNAME>/<YOUR_EXPERIMENT>

Put experiment files in image

On the other hand, if you want the experiment files to actually be part of the image, you can create a Dockerfile with the following content (COPY is generally preferred over ADD):

# or adamliter/psiturk:latest
FROM adamliter/psiturk:2.2.3

COPY ./experiment /psiturk

And then do the following:

docker build -t <YOUR_USERNAME>/<YOUR_EXPERIMENT> .
docker run -it --rm --name <YOUR_EXPERIMENT> -p 22362:22362 <YOUR_USERNAME>/<YOUR_EXPERIMENT>

The former approach, in my opinion, is much preferred for developing and tweaking an experiment. Because the idea is to keep the Docker image as minimal as possible, the container doesn't have any text editors, so it's much nicer to be able to edit the files on the host computer and not have to rebuild the image each time you make an edit.

However, the latter approach is probably better for deployment and/or distributing an experiment. Perhaps it could even improve on or replace the current psiTurk experiment exchange, though I haven't looked at that closely, to be honest.

I'm curious if folks have any feedback/suggestions. I'm relatively new to Docker, so there are probably ways to improve this setup.

After incorporating any feedback I get, I was probably planning on adding images with at least some of the older versions of psiturk. Currently the only tags in the DockerHub repository are latest and 2.2.3 (which are both the same, at the moment).

Also, if this approach is appealing, I'd be happy to work with the psiturk folks to make this more 'official' by, for example, transitioning the DockerHub repo to psiturk/psiturk (or NYUCCL/psiTurk), etc.


P.S.:

@mvdoc I like where you are going with the docker compose solution. Having something like the above as a starting point would help simplify your implementation a bit, I think. Also, adding in a reverse proxy, such as nginx, might be nice, too, particularly if there's a lot of static content in an experiment.

deargle commented 7 years ago

It'd be sweet if you all got this working on the new OpenShift v3, which uses docker.

And I think nginx is a necessity. Over and over I've seen the counsel: "DO NOT RUN GUNICORN AS A PUBLIC-FACING SERVER, use a reverse proxy in front."
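
A reverse proxy in front of the psiTurk server could be as small as the sketch below, which writes a hypothetical nginx server block. The server name, the upstream address, and the static file path are placeholders for illustration; only port 22362 comes from this thread.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical nginx config that proxies to psiTurk.
set -eu

# OUTDIR is a made-up variable for this example; defaults to a fresh temp dir.
outdir="${OUTDIR:-$(mktemp -d)}"
mkdir -p "$outdir"

cat > "$outdir/psiturk.conf" <<'EOF'
# server_name, the upstream address, and the alias path are placeholders.
server {
    listen 80;
    server_name experiment.example.org;

    # Let nginx serve static files directly instead of the app server.
    location /static/ {
        alias /srv/experiment/static/;
    }

    # Everything else is forwarded to the psiTurk server.
    location / {
        proxy_pass http://127.0.0.1:22362;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF

echo "wrote $outdir/psiturk.conf"
```

With a setup like this, the psiTurk port would only need to be reachable from the proxy rather than published to the outside world.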

adamliter commented 7 years ago

@deargle I'm not familiar with OpenShift (I've always used Linode), but I assume it would be pretty straightforward to do since this is precisely the sort of thing that Docker is designed to make easy.

I've been writing some blog posts about how to do this with Linode, which I plan to make public at some point, too. So hopefully that'd be helpful to folks.

And again, if anyone has any suggestions/feedback on the Docker implementation, please let me know! 😸

adamliter commented 7 years ago

I've made psiTurk Docker images available for some older versions of psiTurk, in case anyone is interested. Everything is on GitHub (https://github.com/adamliter/psiTurk-docker) and on Docker Hub (https://hub.docker.com/r/adamliter/psiturk/). Feedback welcome. Feel free to build on top of these images in your own Dockerfiles. I plan to keep the latest tag up to date with the latest version of psiTurk.

(CC especially @deargle and @mvdoc)


Update: As of 2018-07-28, I've also added a dev tag, which is a tag for a Docker image that uses the development version of psiTurk by installing it from the master branch of this repository.

deargle commented 7 years ago

Fair warning that us_only and approve_requirement didn't work before 2.2.0: https://github.com/NYUCCL/psiTurk/releases/tag/2.2.0

adamliter commented 7 years ago

Thanks @deargle. I've added a warning about this in the README.

twiecki commented 6 years ago

@mvdoc This looks fantastic!