brmson / yodaqa

A Question Answering system built on top of the Apache UIMA framework.
http://ailao.eu/yodaqa

Docker Containerization & Orchestration #41

Closed by k0105 8 years ago

k0105 commented 8 years ago

So far I have only provided one Dockerfile, for Yoda itself. That is a start, but containerization only really pays off if all backends run in containers as well, because then we can freely distribute and orchestrate them. For this we need to map huge DBs into the containers via volumes (which also requires adding the corresponding SELinux rules on systems that use it). I have done this for the label services. Based on this Dockerfile, it should be straightforward to put any backend into a container. Here goes:

################################
## YodaQA Label Service Image ##
################################

# Proposed image name: label-service

# Inherit Debian image
FROM debian:jessie

# Update and install dependencies [cmp. https://docs.docker.com/engine/articles/dockerfile_best-practices/]
RUN apt-get update && apt-get install -y \
    curl \
    git  \
    pypy

RUN git clone https://github.com/brmson/label-lookup.git
# If we were to copy label-service files into image
#ADD ./label-service /label-service

# A "RUN cd ..." does not carry over to later instructions, so set the working directory instead
WORKDIR /label-lookup
RUN curl -O https://bootstrap.pypa.io/get-pip.py
# If you run this on an actual system instead of a container: The following 3 commands need root privileges
RUN pypy get-pip.py
RUN mv /usr/local/bin/pip ./pypy_pip
RUN ./pypy_pip install flask SPARQLWrapper

# Same as "export TERM=dumb"; prevents error "Could not open terminal for stdout: $TERM not set"
ENV TERM dumb

# Define working directory
WORKDIR /label-lookup

# Expose port
EXPOSE 5000
EXPOSE 5001

##########
# BEWARE #####################################################################################
# With SELinux you need to run chcon -Rt svirt_sandbox_file_t /home/<user>/docker/docker_shared/
##############################################################################################

# Can be built with: "docker build -t label-service ."

# docker run -it -v /home/<user>/docker/docker_shared/:/shared --entrypoint="pypy" -p 5000:5000 label-service /label-lookup/lookup-service.py /shared/sorted_list.dat
# docker run -it -v /home/<user>/docker/docker_shared/:/shared --entrypoint="pypy" -p 5001:5001 label-service /label-lookup/lookup-service-sqlite.py /shared/labels.db
# RUN pypy lookup-service.py /shared/sorted_list.dat is done in run command; need to map sorted_list.dat as volume (read-only)

# Can be used as usual: curl 127.0.0.1:5000/search/AlbaniaPeople

I will add more Dockerfiles as soon as I find the time and show how to orchestrate them, but for now this example should already demonstrate all tricks that were still needed to accomplish the task.
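The two `docker run` invocations from the Dockerfile comments differ only in port, script and data file, so they can be generated by one small helper. The following is a minimal sketch, not part of the repository: the function name `label_run_cmd` and the host path `/srv/yodaqa/labels` are hypothetical, and the helper only prints the command instead of executing it. The `:ro` mount mode follows the read-only remark in the comments; the SQLite variant is shown with `rw` as a cautious assumption, because the thread only states read-only mapping for `sorted_list.dat`.

```shell
# Sketch: print (rather than execute) the docker run command for one label service.
# Arguments: host data dir, port, service script, data file, optional mount mode (default: ro).
label_run_cmd() {
    host_dir=$1; port=$2; script=$3; data_file=$4; mode=${5:-ro}
    printf 'docker run -it -v %s:/shared:%s --entrypoint=pypy -p %s:%s label-service /label-lookup/%s /shared/%s\n' \
        "$host_dir" "$mode" "$port" "$port" "$script" "$data_file"
}

# /srv/yodaqa/labels is a hypothetical host path; substitute your own.
label_run_cmd /srv/yodaqa/labels 5000 lookup-service.py sorted_list.dat
label_run_cmd /srv/yodaqa/labels 5001 lookup-service-sqlite.py labels.db rw
```

Printing the command first makes it easy to review the volume mapping before anything touches the large data files.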

Best wishes, Joe

k0105 commented 8 years ago

############################
## Freebase Service Image ##
############################

# Proposed image name: freebase
# Readme https://github.com/brmson/yodaqa/tree/master/data/freebase

# Inherit official Java image, see https://hub.docker.com/_/java/
FROM java:8

# Copy fuseki files
ADD ./jena-fuseki-1.1.1 /jena-fuseki-1.1.1

# Same as "export TERM=dumb"; prevents error "Could not open terminal for stdout: $TERM not set"
ENV TERM dumb

# Define working directory
WORKDIR /jena-fuseki-1.1.1

# Expose port
EXPOSE 3030

##########
# BEWARE #####################################################################################
# With SELinux you need to run chcon -Rt svirt_sandbox_file_t /run/media/<user>/<longid>/home/<user>/Downloads/Backends/jena-fuseki-1.1.1/d-freebase/
##############################################################################################

# Can be built with: "docker build -t freebase ."

# docker run -it -v /run/media/<user>/<longid>/home/<user>/Downloads/Backends/jena-fuseki-1.1.1/d-freebase/:/jena-fuseki-1.1.1/d-freebase/ --entrypoint="./fuseki-server" -p 3030:3030 freebase --loc d-freebase /freebase
# RUN ./fuseki-server --loc d-freebase /freebase is done in run command; need to map the data directory as a volume (read-only via :ro)

k0105 commented 8 years ago

###########################
## DBpedia Service Image ##
###########################

# Proposed image name: dbpedia
# Readme https://github.com/brmson/yodaqa/blob/master/data/dbpedia/

# Inherit official Java image, see https://hub.docker.com/_/java/
FROM java:8

# Copy fuseki files
ADD ./jena-fuseki-1.1.1 /jena-fuseki-1.1.1

# Same as "export TERM=dumb"; prevents error "Could not open terminal for stdout: $TERM not set"
ENV TERM dumb

# Define working directory
WORKDIR /jena-fuseki-1.1.1

# Expose port
EXPOSE 3037

##########
# BEWARE #####################################################################################
# With SELinux you need to run chcon -Rt svirt_sandbox_file_t /run/media/<user>/<longid>/home/<user>/Downloads/Backends/DBpedia/jena-fuseki-1.1.1/db/
##############################################################################################

# Can be built with: "docker build -t dbpedia ."

# docker run -it -v /run/media/<user>/<longid>/home/<user>/Downloads/Backends/DBpedia/jena-fuseki-1.1.1/db/:/jena-fuseki-1.1.1/db/ --entrypoint="./fuseki-server" -p 3037:3037 dbpedia --port 3037 --loc db /dbpedia
# RUN ./fuseki-server --port 3037 --loc db /dbpedia is done in run command; need to map the data directory as a volume (read-only via :ro)

Note: I haven't merged the Dockerfiles of DBpedia and Freebase like I did for the label services (yet), because the content of the fuseki directory looks different. I don't remember whether I made any significant changes to them, so for now I just keep them as they are.

k0105 commented 8 years ago

##########################
## enwiki Service Image ##
##########################

# Proposed image name: enwiki
# Readme https://github.com/brmson/yodaqa/tree/master/data/enwiki

# Inherit official Java image, see https://hub.docker.com/_/java/
FROM java:8

# Copy Solr files
ADD ./solr-4.6.0 /solr-4.6.0

# Same as "export TERM=dumb"; prevents error "Could not open terminal for stdout: $TERM not set"
ENV TERM dumb

# Define working directory
WORKDIR /solr-4.6.0/example

# Expose port
EXPOSE 8983

##########
# BEWARE #####################################################################################
# With SELinux you need to run chcon -Rt svirt_sandbox_file_t /run/media/<user>/<longid>/home/<user>/Downloads/Backends/enwiki/solr-4.6.0/example/enwiki/collection1/
##############################################################################################

# Can be built with: "docker build -t enwiki ."

# docker run -it -v /run/media/<user>/<longid>/home/<user>/Downloads/Backends/enwiki/solr-4.6.0/example/enwiki/collection1/:/solr-4.6.0/example/enwiki/collection1/ --entrypoint="java" -p 8983:8983 enwiki -Dsolr.solr.home=enwiki -jar start.jar
# RUN java -Dsolr.solr.home=enwiki -jar start.jar is done in run command; need to map the data directory as a volume (read-only via :ro)

Together with my Dockerfile for Yoda, which Petr added to the main repository a while ago, this should be sufficient to run everything in containers.

k0105 commented 8 years ago

I just did the orchestration via Docker Compose:

dbpedia:
  image: dbpedia
  ports:
   - "3037:3037"
  volumes:
   - /run/media/<user>/<longid>/home/<user>/Downloads/Backends/DBpedia/jena-fuseki-1.1.1/db/:/jena-fuseki-1.1.1/db/
  command: ./fuseki-server --port 3037 --loc db /dbpedia

enwiki:
  image: enwiki
  ports:
   - "8983:8983"
  volumes:
   - /run/media/<user>/<longid>/home/<user>/Downloads/Backends/enwiki/solr-4.6.0/example/enwiki/collection1/:/solr-4.6.0/example/enwiki/collection1/
  command: java -Dsolr.solr.home=enwiki -jar start.jar

freebase:
  image: freebase
  ports:
   - "3030:3030"
  volumes:
   - /run/media/<user>/<longid>/home/<user>/Downloads/Backends/jena-fuseki-1.1.1/d-freebase/:/jena-fuseki-1.1.1/d-freebase/
  command: ./fuseki-server --loc d-freebase /freebase

label1:
  image: label-service
  ports:
   - "5000:5000"
  volumes:
   - /run/media/<user>/<longid>/home/<user>/Downloads/Backends/DBpedia/label-lookup-master/:/shared
  command: pypy /label-lookup/lookup-service.py /shared/sorted_list.dat

label2:
  image: label-service
  ports:
   - "5001:5001"
  volumes:
   - /run/media/<user>/<longid>/home/<user>/Downloads/Backends/DBpedia/label-lookup-master/:/shared
  command: pypy /label-lookup/lookup-service-sqlite.py /shared/labels.db

Then simply run docker-compose up and all backends are fired up automatically.
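Once the containers are up, a quick smoke test can confirm that each published port answers. The following is a minimal sketch under stated assumptions: `curl` is installed on the host, the services are published on 127.0.0.1 with the ports from the Compose file above, and the function names `port_answers` and `check_backends` are hypothetical. Any HTTP response counts as "answering", so this checks reachability only, not whether the data was loaded correctly.

```shell
# Ports published by the Compose file: dbpedia, enwiki, freebase, label1, label2.
BACKEND_PORTS="3037 8983 3030 5000 5001"

# Succeeds if anything answers HTTP on the given local port within 5 seconds.
port_answers() {
    curl -s -o /dev/null --max-time 5 "http://127.0.0.1:$1/"
}

# Reports every port and returns non-zero if at least one did not answer.
check_backends() {
    failed=0
    for port in $BACKEND_PORTS; do
        if port_answers "$port"; then
            echo "port $port: answering"
        else
            echo "port $port: no answer"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, after "docker-compose up" has finished starting the services:
#   check_backends || echo "at least one backend is not reachable"
```

Large indexes can take a while to open, so a failed check shortly after startup may simply mean a backend is still loading.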

pasky commented 8 years ago

Hi! Thanks for all these contributions.

I'm reopening this issue until they are made more accessible by adding them to the wiki or (probably even better) just the git repo source tree.

k0105 commented 8 years ago

Yeah, I'll have to send a ton of pull requests once I'm done writing.

k0105 commented 8 years ago

Done for now. I might discuss more involved orchestration than Docker Compose at some point, but if/when that happens, it belongs in a dedicated thread.

pasky commented 8 years ago

Hmm, sorry to keep reopening this :) but is this really done? I don't think all of the Dockerfiles here are now in git, or are they?

k0105 commented 8 years ago

What are you looking for, specifically? We have a guy working on Kubernetes, but Yoda does not have many dependencies. You need a container for Yoda itself, whose Dockerfile is in the main directory; containers for DBpedia and Freebase, both built from https://github.com/brmson/yodaqa/blob/master/data/dbpedia/Dockerfile; one for Solr, from https://github.com/brmson/yodaqa/blob/master/data/enwiki/Dockerfile; and one for the label services, which I provided as a pull request to your subrepo at https://github.com/brmson/label-lookup/pull/1 (it hasn't been accepted yet).

In case you're wondering how to create two different containers, e.g. DBpedia and Freebase, out of one image, look here: https://github.com/k0105/DockerAccessories/blob/master/docker-compose.yml

To conclude my example, both dbpedia and freebase use a common fuseki image:

dbpedia:
  image: fuseki
  ports:
   - "3037:3037"
  volumes:
   - /media/fp/DataBackends/data/db/:/jena-fuseki-1.1.1/db/
  command: ./fuseki-server --port 3037 --loc db /dbpedia

freebase:
  image: fuseki
  ports:
   - "3030:3030"
  volumes:
   - /media/fp/DataBackends/data/d-freebase/:/jena-fuseki-1.1.1/d-freebase/
  command: ./fuseki-server --loc d-freebase /freebase

And only their mapped volumes, ports and commands determine which task they fulfill at runtime.
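The same idea can be expressed without Compose. The following is a minimal sketch that prints one `docker run` command per service from a small table of name, port and data directory. The function name `fuseki_cmds` is hypothetical, and the `-d` and `--name` flags are additions not used elsewhere in this thread. Passing `--port 3030` explicitly for Freebase is meant to match the port published for it above.

```shell
# Sketch: derive both Fuseki containers from the single "fuseki" image.
# Argument: host directory that contains the db/ and d-freebase/ data directories.
fuseki_cmds() {
    data_root=$1
    # Table columns: service name, port, data directory name.
    while read -r name port loc; do
        printf 'docker run -d --name %s -p %s:%s -v %s/%s/:/jena-fuseki-1.1.1/%s/ --entrypoint=./fuseki-server fuseki --port %s --loc %s /%s\n' \
            "$name" "$port" "$port" "$data_root" "$loc" "$loc" "$port" "$loc" "$name"
    done <<EOF
dbpedia 3037 db
freebase 3030 d-freebase
EOF
}

# Prints the two commands; nothing is started.
fuseki_cmds /media/fp/DataBackends/data
```

Adding a third Fuseki-backed dataset would then be one more row in the table rather than another image.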

So in short: yes, I do think you have all the Dockerfiles, and if not, you should be able to find them here: https://github.com/k0105/DockerAccessories

Please let me know if you need anything else.

pasky commented 8 years ago

Thanks - I think we have everything we need in this issue, but I just wanted to point out that not everything is transposed to the source tree, i.e. accessible to whoever comes to the project. I completely missed the label-lookup PR, thanks, merged that now. :)

So AIUI, what is a TODO to fully transpose the content of this issue is:

Does that make sense?

k0105 commented 8 years ago

Sure, I'll respond either later tonight or, because I have two presentations tomorrow, on Thursday and provide everything you need to run Yoda with all dependencies inside Docker with Docker Compose. It's simple and only takes a few minutes - you're gonna like it. Unfortunately, right now I gotta run, so just very briefly some snippets:

The directories with the files:

ls d-freebase
GOSP.dat  GSPO.idn      OSP.dat   POSG.dat       prefixes.dat   SPOG.idn
GOSP.idn  journal.jrnl  OSPG.dat  POSG.idn       prefixIdx.dat  SPO.idn
GPOS.dat  node2id.dat   OSPG.idn  POS.idn        prefixIdx.idn  stats.opt
GPOS.idn  node2id.idn   OSP.idn   prefix2id.dat  SPO.dat        tdb.lock
GSPO.dat  nodes.dat     POS.dat   prefix2id.idn  SPOG.dat
tree enwiki
enwiki
└── collection1
    ├── conf
    │   ├── admin-extra.html
    │   ├── admin-extra.menu-bottom.html
    │   ├── admin-extra.menu-top.html
    │   ├── clustering
    │   │   └── carrot2
    │   │       ├── kmeans-attributes.xml
    │   │       ├── lingo-attributes.xml
    │   │       └── stc-attributes.xml
    │   ├── currency.xml
    │   ├── data-config.xml
    │   ├── data-config.xml~
    │   ├── dataimport.properties
    │   ├── elevate.xml
    │   ├── lang
    │   │   ├── contractions_ca.txt
    │   │   ├── contractions_fr.txt
    │   │   ├── contractions_ga.txt
    │   │   ├── contractions_it.txt
    │   │   ├── hyphenations_ga.txt
    │   │   ├── stemdict_nl.txt
    │   │   ├── stoptags_ja.txt
    │   │   ├── stopwords_ar.txt
    │   │   ├── stopwords_bg.txt
    │   │   ├── stopwords_ca.txt
    │   │   ├── stopwords_cz.txt
    │   │   ├── stopwords_da.txt
    │   │   ├── stopwords_de.txt
    │   │   ├── stopwords_el.txt
    │   │   ├── stopwords_en.txt
    │   │   ├── stopwords_es.txt
    │   │   ├── stopwords_eu.txt
    │   │   ├── stopwords_fa.txt
    │   │   ├── stopwords_fi.txt
    │   │   ├── stopwords_fr.txt
    │   │   ├── stopwords_ga.txt
    │   │   ├── stopwords_gl.txt
    │   │   ├── stopwords_hi.txt
    │   │   ├── stopwords_hu.txt
    │   │   ├── stopwords_hy.txt
    │   │   ├── stopwords_id.txt
    │   │   ├── stopwords_it.txt
    │   │   ├── stopwords_ja.txt
    │   │   ├── stopwords_lv.txt
    │   │   ├── stopwords_nl.txt
    │   │   ├── stopwords_no.txt
    │   │   ├── stopwords_pt.txt
    │   │   ├── stopwords_ro.txt
    │   │   ├── stopwords_ru.txt
    │   │   ├── stopwords_sv.txt
    │   │   ├── stopwords_th.txt
    │   │   ├── stopwords_tr.txt
    │   │   └── userdict_ja.txt
    │   ├── mapping-FoldToASCII.txt
    │   ├── mapping-ISOLatin1Accent.txt
    │   ├── protwords.txt
    │   ├── schema.xml
    │   ├── scripts.conf
    │   ├── solrconfig.xml
    │   ├── spellings.txt
    │   ├── stopwords.txt
    │   ├── synonyms.txt
    │   ├── update-script.js
    │   ├── velocity
    │   │   ├── browse.vm
    │   │   ├── cluster_results.vm
    │   │   ├── cluster.vm
    │   │   ├── debug.vm
    │   │   ├── did_you_mean.vm
    │   │   ├── error.vm
    │   │   ├── facet_fields.vm
    │   │   ├── facet_pivot.vm
    │   │   ├── facet_queries.vm
    │   │   ├── facet_ranges.vm
    │   │   ├── facets.vm
    │   │   ├── footer.vm
    │   │   ├── header.vm
    │   │   ├── head.vm
    │   │   ├── hit_grouped.vm
    │   │   ├── hit_plain.vm
    │   │   ├── hit.vm
    │   │   ├── join_doc.vm
    │   │   ├── jquery.autocomplete.css
    │   │   ├── jquery.autocomplete.js
    │   │   ├── layout.vm
    │   │   ├── main.css
    │   │   ├── mime_type_lists.vm
    │   │   ├── pagination_bottom.vm
    │   │   ├── pagination_top.vm
    │   │   ├── product_doc.vm
    │   │   ├── query_form.vm
    │   │   ├── query_group.vm
    │   │   ├── query_spatial.vm
    │   │   ├── query.vm
    │   │   ├── README.txt
    │   │   ├── results_list.vm
    │   │   ├── richtext_doc.vm
    │   │   ├── suggest.vm
    │   │   ├── tabs.vm
    │   │   └── VM_global_library.vm
    │   └── xslt
    │       ├── example_atom.xsl
    │       ├── example_rss.xsl
    │       ├── example.xsl
    │       ├── luke.xsl
    │       └── updateXml.xsl
    ├── core.properties
    ├── data
    │   ├── index
    │   │   ├── _7l.fdt
    │   │   ├── _7l.fdx
    │   │   ├── _7l.fnm
    │   │   ├── _7l_Lucene41_0.doc
    │   │   ├── _7l_Lucene41_0.pos
    │   │   ├── _7l_Lucene41_0.tim
    │   │   ├── _7l_Lucene41_0.tip
    │   │   ├── _7l.nvd
    │   │   ├── _7l.nvm
    │   │   ├── _7l.si
    │   │   ├── _ay.fdt
    │   │   ├── _ay.fdx
    │   │   ├── _ay.fnm
    │   │   ├── _ay_Lucene41_0.doc
    │   │   ├── _ay_Lucene41_0.pos
    │   │   ├── _ay_Lucene41_0.tim
    │   │   ├── _ay_Lucene41_0.tip
    │   │   ├── _ay.nvd
    │   │   ├── _ay.nvm
    │   │   ├── _ay.si
    │   │   ├── _bi.fdt
    │   │   ├── _bi.fdx
    │   │   ├── _bi.fnm
    │   │   ├── _bi_Lucene41_0.doc
    │   │   ├── _bi_Lucene41_0.pos
    │   │   ├── _bi_Lucene41_0.tim
    │   │   ├── _bi_Lucene41_0.tip
    │   │   ├── _bi.nvd
    │   │   ├── _bi.nvm
    │   │   ├── _bi.si
    │   │   ├── _bs.fdt
    │   │   ├── _bs.fdx
    │   │   ├── _bs.fnm
    │   │   ├── _bs_Lucene41_0.doc
    │   │   ├── _bs_Lucene41_0.pos
    │   │   ├── _bs_Lucene41_0.tim
    │   │   ├── _bs_Lucene41_0.tip
    │   │   ├── _bs.nvd
    │   │   ├── _bs.nvm
    │   │   ├── _bs.si
    │   │   ├── _bv.fdt
    │   │   ├── _bv.fdx
    │   │   ├── _bv.fnm
    │   │   ├── _bv_Lucene41_0.doc
    │   │   ├── _bv_Lucene41_0.pos
    │   │   ├── _bv_Lucene41_0.tim
    │   │   ├── _bv_Lucene41_0.tip
    │   │   ├── _bv.nvd
    │   │   ├── _bv.nvm
    │   │   ├── _bv.si
    │   │   ├── _bw.fdt
    │   │   ├── _bw.fdx
    │   │   ├── _bw.fnm
    │   │   ├── _bw_Lucene41_0.doc
    │   │   ├── _bw_Lucene41_0.pos
    │   │   ├── _bw_Lucene41_0.tim
    │   │   ├── _bw_Lucene41_0.tip
    │   │   ├── _bw.nvd
    │   │   ├── _bw.nvm
    │   │   ├── _bw.si
    │   │   ├── _bx.fdt
    │   │   ├── _bx.fdx
    │   │   ├── _bx.fnm
    │   │   ├── _bx_Lucene41_0.doc
    │   │   ├── _bx_Lucene41_0.pos
    │   │   ├── _bx_Lucene41_0.tim
    │   │   ├── _bx_Lucene41_0.tip
    │   │   ├── _bx.nvd
    │   │   ├── _bx.nvm
    │   │   ├── _bx.si
    │   │   ├── _bz.fdt
    │   │   ├── _bz.fdx
    │   │   ├── _bz.fnm
    │   │   ├── _bz_Lucene41_0.doc
    │   │   ├── _bz_Lucene41_0.pos
    │   │   ├── _bz_Lucene41_0.tim
    │   │   ├── _bz_Lucene41_0.tip
    │   │   ├── _bz.nvd
    │   │   ├── _bz.nvm
    │   │   ├── _bz.si
    │   │   ├── _c2.fdt
    │   │   ├── _c2.fdx
    │   │   ├── _c2.fnm
    │   │   ├── _c2_Lucene41_0.doc
    │   │   ├── _c2_Lucene41_0.pos
    │   │   ├── _c2_Lucene41_0.tim
    │   │   ├── _c2_Lucene41_0.tip
    │   │   ├── _c2.nvd
    │   │   ├── _c2.nvm
    │   │   ├── _c2.si
    │   │   ├── _c3.fdt
    │   │   ├── _c3.fdx
    │   │   ├── _c3.fnm
    │   │   ├── _c3_Lucene41_0.doc
    │   │   ├── _c3_Lucene41_0.pos
    │   │   ├── _c3_Lucene41_0.tim
    │   │   ├── _c3_Lucene41_0.tip
    │   │   ├── _c3.nvd
    │   │   ├── _c3.nvm
    │   │   ├── _c3.si
    │   │   ├── _c4.fdt
    │   │   ├── _c4.fdx
    │   │   ├── _c4.fnm
    │   │   ├── _c4_Lucene41_0.doc
    │   │   ├── _c4_Lucene41_0.pos
    │   │   ├── _c4_Lucene41_0.tim
    │   │   ├── _c4_Lucene41_0.tip
    │   │   ├── _c4.nvd
    │   │   ├── _c4.nvm
    │   │   ├── _c4.si
    │   │   ├── _c6.fdt
    │   │   ├── _c6.fdx
    │   │   ├── _c6.fnm
    │   │   ├── _c6_Lucene41_0.doc
    │   │   ├── _c6_Lucene41_0.pos
    │   │   ├── _c6_Lucene41_0.tim
    │   │   ├── _c6_Lucene41_0.tip
    │   │   ├── _c6.nvd
    │   │   ├── _c6.nvm
    │   │   ├── _c6.si
    │   │   ├── _c7.fdt
    │   │   ├── _c7.fdx
    │   │   ├── _c7.fnm
    │   │   ├── _c7_Lucene41_0.doc
    │   │   ├── _c7_Lucene41_0.pos
    │   │   ├── _c7_Lucene41_0.tim
    │   │   ├── _c7_Lucene41_0.tip
    │   │   ├── _c7.nvd
    │   │   ├── _c7.nvm
    │   │   ├── _c7.si
    │   │   ├── _c8.fdt
    │   │   ├── _c8.fdx
    │   │   ├── _c8.fnm
    │   │   ├── _c8_Lucene41_0.doc
    │   │   ├── _c8_Lucene41_0.pos
    │   │   ├── _c8_Lucene41_0.tim
    │   │   ├── _c8_Lucene41_0.tip
    │   │   ├── _c8.nvd
    │   │   ├── _c8.nvm
    │   │   ├── _c8.si
    │   │   ├── _c9.fdt
    │   │   ├── _c9.fdx
    │   │   ├── _c9.fnm
    │   │   ├── _c9_Lucene41_0.doc
    │   │   ├── _c9_Lucene41_0.pos
    │   │   ├── _c9_Lucene41_0.tim
    │   │   ├── _c9_Lucene41_0.tip
    │   │   ├── _c9.nvd
    │   │   ├── _c9.nvm
    │   │   ├── _c9.si
    │   │   ├── _cc.fdt
    │   │   ├── _cc.fdx
    │   │   ├── _cc.fnm
    │   │   ├── _cc_Lucene41_0.doc
    │   │   ├── _cc_Lucene41_0.pos
    │   │   ├── _cc_Lucene41_0.tim
    │   │   ├── _cc_Lucene41_0.tip
    │   │   ├── _cc.nvd
    │   │   ├── _cc.nvm
    │   │   ├── _cc.si
    │   │   ├── _ce.fdt
    │   │   ├── _ce.fdx
    │   │   ├── _ce.fnm
    │   │   ├── _ce_Lucene41_0.doc
    │   │   ├── _ce_Lucene41_0.pos
    │   │   ├── _ce_Lucene41_0.tim
    │   │   ├── _ce_Lucene41_0.tip
    │   │   ├── _ce.nvd
    │   │   ├── _ce.nvm
    │   │   ├── _ce.si
    │   │   ├── _cf.fdt
    │   │   ├── _cf.fdx
    │   │   ├── _cf.fnm
    │   │   ├── _cf_Lucene41_0.doc
    │   │   ├── _cf_Lucene41_0.pos
    │   │   ├── _cf_Lucene41_0.tim
    │   │   ├── _cf_Lucene41_0.tip
    │   │   ├── _cf.nvd
    │   │   ├── _cf.nvm
    │   │   ├── _cf.si
    │   │   ├── _cg.fdt
    │   │   ├── _cg.fdx
    │   │   ├── _cg.fnm
    │   │   ├── _cg_Lucene41_0.doc
    │   │   ├── _cg_Lucene41_0.pos
    │   │   ├── _cg_Lucene41_0.tim
    │   │   ├── _cg_Lucene41_0.tip
    │   │   ├── _cg.nvd
    │   │   ├── _cg.nvm
    │   │   ├── _cg.si
    │   │   ├── _ch.fdt
    │   │   ├── _ch.fdx
    │   │   ├── _ch.fnm
    │   │   ├── _ch_Lucene41_0.doc
    │   │   ├── _ch_Lucene41_0.pos
    │   │   ├── _ch_Lucene41_0.tim
    │   │   ├── _ch_Lucene41_0.tip
    │   │   ├── _ch.nvd
    │   │   ├── _ch.nvm
    │   │   ├── _ch.si
    │   │   ├── _ci.fdt
    │   │   ├── _ci.fdx
    │   │   ├── _ci.fnm
    │   │   ├── _ci_Lucene41_0.doc
    │   │   ├── _ci_Lucene41_0.pos
    │   │   ├── _ci_Lucene41_0.tim
    │   │   ├── _ci_Lucene41_0.tip
    │   │   ├── _ci.nvd
    │   │   ├── _ci.nvm
    │   │   ├── _ci.si
    │   │   ├── _cj.fdt
    │   │   ├── _cj.fdx
    │   │   ├── _cj.fnm
    │   │   ├── _cj_Lucene41_0.doc
    │   │   ├── _cj_Lucene41_0.pos
    │   │   ├── _cj_Lucene41_0.tim
    │   │   ├── _cj_Lucene41_0.tip
    │   │   ├── _cj.nvd
    │   │   ├── _cj.nvm
    │   │   ├── _cj.si
    │   │   ├── _ck.fdt
    │   │   ├── _ck.fdx
    │   │   ├── _ck.fnm
    │   │   ├── _ck_Lucene41_0.doc
    │   │   ├── _ck_Lucene41_0.pos
    │   │   ├── _ck_Lucene41_0.tim
    │   │   ├── _ck_Lucene41_0.tip
    │   │   ├── _ck.nvd
    │   │   ├── _ck.nvm
    │   │   ├── _ck.si
    │   │   ├── _cl.fdt
    │   │   ├── _cl.fdx
    │   │   ├── _cl.fnm
    │   │   ├── _cl_Lucene41_0.doc
    │   │   ├── _cl_Lucene41_0.pos
    │   │   ├── _cl_Lucene41_0.tim
    │   │   ├── _cl_Lucene41_0.tip
    │   │   ├── _cl.nvd
    │   │   ├── _cl.nvm
    │   │   ├── _cl.si
    │   │   ├── _cm.fdt
    │   │   ├── _cm.fdx
    │   │   ├── _cm.fnm
    │   │   ├── _cm_Lucene41_0.doc
    │   │   ├── _cm_Lucene41_0.pos
    │   │   ├── _cm_Lucene41_0.tim
    │   │   ├── _cm_Lucene41_0.tip
    │   │   ├── _cm.nvd
    │   │   ├── _cm.nvm
    │   │   ├── _cm.si
    │   │   ├── _cn.fdt
    │   │   ├── _cn.fdx
    │   │   ├── _cn.fnm
    │   │   ├── _cn_Lucene41_0.doc
    │   │   ├── _cn_Lucene41_0.pos
    │   │   ├── _cn_Lucene41_0.tim
    │   │   ├── _cn_Lucene41_0.tip
    │   │   ├── _cn.nvd
    │   │   ├── _cn.nvm
    │   │   ├── _cn.si
    │   │   ├── segments_5t
    │   │   └── segments.gen
    │   └── tlog
    │       └── tlog.0000000000000000206
    └── README.txt
ls labels
labels.db  sorted_list.dat
ls db
GOSP.dat  GSPO.idn      OSP.dat   POSG.dat       prefixes.dat   SPOG.idn
GOSP.idn  journal.jrnl  OSPG.dat  POSG.idn       prefixIdx.dat  SPO.idn
GPOS.dat  node2id.dat   OSPG.idn  POS.idn        prefixIdx.idn  stats.opt
GPOS.idn  node2id.idn   OSP.idn   prefix2id.dat  SPO.dat        tdb.lock
GSPO.dat  nodes.dat     POS.dat   prefix2id.idn  SPOG.dat

The Docker Compose YAML:

dbpedia:
  image: fuseki
  ports:
   - "3037:3037"
  volumes:
   - /media/fp/DataBackends/data/db/:/jena-fuseki-1.1.1/db/
  command: ./fuseki-server --port 3037 --loc db /dbpedia

enwiki:
  image: solr
  ports:
   - "8983:8983"
  volumes:
   - /media/fp/DataBackends/data/enwiki/collection1/:/solr-4.6.0/example/enwiki/collection1/
  command: java -Dsolr.solr.home=enwiki -jar start.jar

freebase:
  image: fuseki
  ports:
   - "3030:3030"
  volumes:
   - /media/fp/DataBackends/data/d-freebase/:/jena-fuseki-1.1.1/d-freebase/
  command: ./fuseki-server --loc d-freebase /freebase

label1:
  image: labels
  ports:
   - "5000:5000"
  volumes:
   - /media/fp/DataBackends/data/labels/:/shared
  command: pypy /label-lookup/lookup-service.py /shared/sorted_list.dat

label2:
  image: labels
  ports:
   - "5001:5001"
  volumes:
   - /media/fp/DataBackends/data/labels/:/shared
  command: pypy /label-lookup/lookup-service-sqlite.py /shared/labels.db

webqa:
  image: webqa
  ports:
   - "4000:4000"
  volumes:
   - /media/fp/DataBackends/00keys/webqa:/qaservice/conf
  command: ./gradlew runRestBackend

yoda_offline:
  image: yoda_offline_tested
  links:
   - enwiki:enwiki
   - dbpedia:dbpedia
   - freebase:freebase
   - label1:label1
   - label2:label2
  ports:
   - "4567:4567"
  command: ./gradlew web -q -Dcz.brmlab.yodaqa.dbpediaurl="http://dbpedia:3037/dbpedia/query" -Dcz.brmlab.yodaqa.freebaseurl="http://freebase:3030/freebase/query" -Dcz.brmlab.yodaqa.solrurl="http://enwiki:8983/solr" -Dcz.brmlab.yodaqa.label1url="http://label1:5000" -Dcz.brmlab.yodaqa.label2url="http://label2:5001"
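
The `links` entries make each backend reachable under its service name, which is why the URLs in the `yoda_offline` command use hostnames such as `dbpedia` and `enwiki`. The following is a minimal sketch of how that command is put together. The function name `yoda_web_cmd` is hypothetical, and the optional host argument (for backends published on a single machine) is an assumption, not something the thread describes.

```shell
# Sketch: print the gradlew command with the backend URLs filled in.
# Without an argument, the Compose service names are used as hostnames;
# with one (e.g. 127.0.0.1), every backend is addressed on that host.
yoda_web_cmd() {
    host=${1:-}
    printf './gradlew web -q'
    printf ' -Dcz.brmlab.yodaqa.dbpediaurl="http://%s:3037/dbpedia/query"' "${host:-dbpedia}"
    printf ' -Dcz.brmlab.yodaqa.freebaseurl="http://%s:3030/freebase/query"' "${host:-freebase}"
    printf ' -Dcz.brmlab.yodaqa.solrurl="http://%s:8983/solr"' "${host:-enwiki}"
    printf ' -Dcz.brmlab.yodaqa.label1url="http://%s:5000"' "${host:-label1}"
    printf ' -Dcz.brmlab.yodaqa.label2url="http://%s:5001"' "${host:-label2}"
    printf '\n'
}

# Prints the command as used by the yoda_offline service.
yoda_web_cmd
```

Called with `127.0.0.1`, the same helper prints a variant for running Yoda directly on the host against the published ports.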

k0105 commented 8 years ago

I have added the docker-compose file and a README about Docker to the data subdirectory, since they are easy to remove if you don't like them (I will still keep sending formal requests if I propose code changes instead of arbitrarily messing up your code). They should fulfill your requirements. Also, I pushed the images to Docker Hub, so once the data directory has been prepared, one should simply be able to adapt the path in the Docker Compose file and run everything via docker-compose up.
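Adapting the path can be done by hand or with a one-line substitution. The following is a minimal sketch, assuming the volume entries look like the ones posted earlier in this thread (`- <host path>:<container path>`) and that the paths contain no `|`, `&` or regular-expression metacharacters that would need escaping. The function name `rewrite_data_root`, the target path `/srv/yodaqa` and the output file name are hypothetical.

```shell
# Sketch: replace the host-side data root in volume entries of a Compose file.
# Reads the file on stdin and writes the adapted version to stdout.
rewrite_data_root() {
    old_root=$1; new_root=$2
    sed "s|- ${old_root}/|- ${new_root}/|"
}

# Demonstration on a single volume line taken from the Compose file above.
printf '   - /media/fp/DataBackends/data/db/:/jena-fuseki-1.1.1/db/\n' |
    rewrite_data_root /media/fp/DataBackends/data /srv/yodaqa

# Usage on a whole file:
#   rewrite_data_root /media/fp/DataBackends/data /srv/yodaqa < docker-compose.yml > docker-compose.local.yml
#   docker-compose -f docker-compose.local.yml up
```

Entries that do not start with the old data root, such as the `webqa` configuration volume, are left unchanged.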

[I have one private image free, so I will probably add the webqa and ensemble soon.]

pasky commented 8 years ago

Awesome, thank you! I added some pointers to the main README too.