nextml / NEXT

NEXT is a machine learning system that runs in the cloud and makes it easy to develop, evaluate, and apply active learning in the real world. Ask better questions. Get better results. Faster. Automated.
http://nextml.org
Apache License 2.0

Clean up Dockerfile / dependencies / environments #191

Closed by dconathan 7 years ago

dconathan commented 7 years ago

It has always bothered me that the backend, workers, and mongodb all use the same Docker image. It seems very messy and defeats the point of Docker: for example, workers don't need mongo installed, just pymongo, and likewise mongodb doesn't need Python. I believe the only reason for this was as a shortcut so that mongo_backup could run the backup script, but we're removing that as per #168?

I also ran into some issues here at amfam with the OLD version of pip that is in the ubuntu:14.04 repos (updating it wouldn't even work - something to do with the firewall)... so I decided to take this on.

I changed it so that the mongodb container just uses the mongo dockerhub image, and my backend/worker Dockerfile looks like:

FROM python:2
MAINTAINER Devin Conathan, ***@***.com

# Install python dependencies for next_backend
COPY requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

Super simple and easier to maintain. We could even freeze things at more specific versions (which we probably should be doing anyway). All tests pass with my setup, though cadvisor isn't working properly; that seems like a simple fix once I diagnose it.
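
As a rough sketch of what freezing versions might look like (not the Dockerfile that was eventually merged): the `2.7` tag, the `--no-cache-dir` flag, and the pinned entry mentioned in the comment are all assumptions made for illustration, not details from this thread.

```dockerfile
# Sketch only: the 2.7 tag is an assumption, chosen to pin the base image
# more tightly than the floating python:2 tag.
FROM python:2.7

# Copy the dependency list on its own so this layer and the install below
# are reused from the build cache until requirements.txt changes.
COPY requirements.txt /requirements.txt

# Assumes requirements.txt pins exact versions with ==, for example a line
# such as "numpy==1.13.0" (illustrative, not the project's actual pin).
RUN pip install --no-cache-dir -r /requirements.txt
```

Pinning both the base image tag and the Python packages should make rebuilds more reproducible, at the cost of having to bump the versions by hand.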

Thoughts?

erinzm commented 7 years ago

This is definitely the Right Way™ to do this. I really hope we can get this in master. It would also be cool if we could use -alpine containers since they're far smaller/lighter than the Ubuntu/Debian-based ones.

dconathan commented 7 years ago

FYI the cadvisor box is related to https://github.com/google/cadvisor/issues/1556 - I think it's because the mongo dockerhub image is built on busybox. We should either use a different mongo docker image or wait for this bug to get fixed.

@liamim re: alpine containers: from what I've heard, it's not worth the extra maintenance once you are doing stuff with e.g. numpy. Once you add all the tools/libs necessary to compile, the alpine image is about the same size as a debian image with numpy.

lalitkumarj commented 7 years ago

I approve.

erinzm commented 7 years ago

@dconathan that's a fair point! In any case, having up-to-date libraries in the apt repositories would be nice.

erinzm commented 7 years ago

This was fixed by #195; closing.