Closed sberryman closed 6 years ago
This used to be the case, I'm not sure why it changed.
Sorry, I'm not sure what you mean by "this used to be the case". Are you talking about the removal of APT cache files? I'm pretty sure you only used to be able to get rid of intermediate layers by flattening the image, not very many people do that though.
Update: technically docker calls it -squash
so hopefully I didn't confuse you with 'flatten'
Any idea why libvtk6-dev
header files are getting installed on line 56? They are already included in the massive dependency install phase on line 15.
Want me to take a stab at rewriting it? How condensed do you want the final image? We can run almost everything in a single layer or I can try and split it out a bit. It would help to know what you tend to update. Does the build time of the image matter?
I could install the majority of deps near the top and then install build-essential, cmake, perform build and clean up files on another layer.
This would leave you with apt dependencies in the 1st layer, pip dependencies in 2nd, env & copy on a few more layers and finally build and cleanup on the final layer
I played around with the Dockerfile a bit, here is docker history
for a local build of ODM (96be8c4)
IMAGE CREATED CREATED BY SIZE COMMENT
f4e1f13d0075 Less than a second ago /bin/sh -c #(nop) ENTRYPOINT ["python" "/co… 0B
69e075ebbf62 Less than a second ago /bin/sh -c rm -rf /code/SuperBuild/download … 0B
8712d7541ebc Less than a second ago /bin/sh -c apt-get clean && rm -rf /var/lib/… 0B
64bb9330591e Less than a second ago /bin/sh -c apt-get install -y libvtk6-dev 299MB
d6fe276c3611 Less than a second ago /bin/sh -c apt-get -y remove libgl1-mesa-dri… 3.23MB
be27339e4c81 Less than a second ago /bin/sh -c cd SuperBuild && mkdir build && c… 2.46GB
8695805db5a7 28 minutes ago /bin/sh -c #(nop) COPY file:d2dd9312ac40f85c… 6B
fd3d44ede059 28 minutes ago /bin/sh -c #(nop) COPY file:7c13e4e876f9f1e8… 2.31kB
478225373e32 28 minutes ago /bin/sh -c #(nop) COPY file:de9972b7e00aab74… 3.91kB
77e3f134b1bd 28 minutes ago /bin/sh -c #(nop) COPY dir:0096577dd9215d190… 18.5kB
4a99d26ffff9 28 minutes ago /bin/sh -c #(nop) COPY dir:d612083a96627f68d… 100kB
8ffc941bc51f 28 minutes ago /bin/sh -c #(nop) COPY file:48c2781c0c3cef57… 292B
494c9c556426 28 minutes ago /bin/sh -c #(nop) COPY file:91a496b3597da96a… 1.57kB
fc361d40a3e0 28 minutes ago /bin/sh -c #(nop) COPY dir:68c81018f4833e263… 453kB
38efbc40404a 28 minutes ago /bin/sh -c #(nop) COPY dir:e0e90f8904a9ed21a… 74.8kB
91c2051df250 28 minutes ago /bin/sh -c #(nop) COPY dir:779e102780c9e2f15… 272kB
a66b4bf113bf 28 minutes ago /bin/sh -c #(nop) COPY file:0ee2b2f83c2c5f46… 4.91kB
fe3fc6309d03 28 minutes ago /bin/sh -c #(nop) COPY file:449f3516ab95493f… 547B
cb47c53f7025 28 minutes ago /bin/sh -c #(nop) COPY file:2475dfdfd7a6111c… 654B
a3d14dbfe747 28 minutes ago /bin/sh -c #(nop) WORKDIR /code 0B
5035510c7737 28 minutes ago /bin/sh -c mkdir /code 0B
9858aeeed408 28 minutes ago /bin/sh -c #(nop) ENV LD_LIBRARY_PATH=:/cod… 0B
53a7d71dc731 28 minutes ago /bin/sh -c #(nop) ENV PYTHONPATH=:/code/Sup… 0B
c8ec01695612 28 minutes ago /bin/sh -c #(nop) ENV PYTHONPATH=:/code/Sup… 0B
2c91acaea5aa 28 minutes ago /bin/sh -c pip install -U PyYAML exifread gp… 353MB
5a7e8746e9ae 30 minutes ago /bin/sh -c pip install setuptools 2.85MB
96b6e0695ffb 30 minutes ago /bin/sh -c pip install --upgrade pip 8.22MB
40b6d629abeb 30 minutes ago /bin/sh -c apt-get remove libdc1394-22-dev 0B
91f8de57c68f 30 minutes ago /bin/sh -c apt-get install --no-install-reco… 1.67GB
6fd16cdd33fe 36 minutes ago /bin/sh -c apt-get update -y 686kB
d4d0361edc78 36 minutes ago /bin/sh -c add-apt-repository -y ppa:george-… 6.27kB
076942099b74 36 minutes ago /bin/sh -c add-apt-repository -y ppa:ubuntug… 15.9kB
f6ae1c8f8633 36 minutes ago /bin/sh -c apt-get install software-properti… 0B
63771a799817 36 minutes ago /bin/sh -c apt-get update -y 43.1MB
d0b382d3989c 5 days ago /bin/sh -c #(nop) ENV DEBIAN_FRONTEND=nonin… 0B
166cfc3f6974 6 weeks ago /bin/sh -c #(nop) CMD ["/sbin/my_init"] 0B
<missing> 6 weeks ago /bin/sh -c #(nop) ENV DEBIAN_FRONTEND=telety… 0B
<missing> 6 weeks ago /bin/sh -c /bd_build/prepare.sh && /bd_buil… 97.3MB
<missing> 6 weeks ago /bin/sh -c #(nop) COPY dir:e7a5eda59d878c69c… 39.4kB
<missing> 6 weeks ago /bin/sh -c #(nop) MAINTAINER Phusion <info@p… 0B
<missing> 6 weeks ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 6 weeks ago /bin/sh -c mkdir -p /run/systemd && echo 'do… 7B
<missing> 6 weeks ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$… 2.76kB
<missing> 6 weeks ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0B
<missing> 6 weeks ago /bin/sh -c set -xe && echo '#!/bin/sh' > /… 745B
<missing> 6 weeks ago /bin/sh -c #(nop) ADD file:a3344b835ea6fdc56… 112MB
My changes to the Dockerfile WIP
IMAGE CREATED CREATED BY SIZE COMMENT
c2d1bb8dc4dc 3 minutes ago /bin/sh -c #(nop) ENTRYPOINT ["python" "/co… 0B
0d35e01de91a 3 minutes ago /bin/sh -c apt-get update && apt-get ins… 627MB
1e1467300432 25 minutes ago /bin/sh -c #(nop) COPY file:d2dd9312ac40f85c… 6B
ea975646156d 25 minutes ago /bin/sh -c #(nop) COPY file:7c13e4e876f9f1e8… 2.31kB
729cfdfc1133 25 minutes ago /bin/sh -c #(nop) COPY file:de9972b7e00aab74… 3.91kB
2fe519a788ee 25 minutes ago /bin/sh -c #(nop) COPY dir:0096577dd9215d190… 18.5kB
2e0d942637ba 25 minutes ago /bin/sh -c #(nop) COPY dir:d612083a96627f68d… 100kB
b0eff52163b5 25 minutes ago /bin/sh -c #(nop) COPY file:48c2781c0c3cef57… 292B
614c90583c89 25 minutes ago /bin/sh -c #(nop) COPY file:91a496b3597da96a… 1.57kB
bcdb76b0a87b 25 minutes ago /bin/sh -c #(nop) COPY dir:68c81018f4833e263… 453kB
b875347ab866 25 minutes ago /bin/sh -c #(nop) COPY dir:e0e90f8904a9ed21a… 74.8kB
2efbadd7bd63 25 minutes ago /bin/sh -c #(nop) COPY dir:779e102780c9e2f15… 272kB
fee9b98779e3 25 minutes ago /bin/sh -c #(nop) COPY file:0ee2b2f83c2c5f46… 4.91kB
028ee1152551 25 minutes ago /bin/sh -c #(nop) COPY file:449f3516ab95493f… 547B
35efeee52f01 25 minutes ago /bin/sh -c #(nop) COPY file:2475dfdfd7a6111c… 654B
2270283cdc8f 25 minutes ago /bin/sh -c #(nop) WORKDIR /code 0B
b0f841a6cba8 25 minutes ago /bin/sh -c #(nop) ENV PYTHONPATH=:/code/Sup… 0B
d8e212be8c7e 25 minutes ago /bin/sh -c add-apt-repository -y ppa:ubuntug… 1.82GB
d0b382d3989c 5 days ago /bin/sh -c #(nop) ENV DEBIAN_FRONTEND=nonin… 0B
166cfc3f6974 6 weeks ago /bin/sh -c #(nop) CMD ["/sbin/my_init"] 0B
<missing> 6 weeks ago /bin/sh -c #(nop) ENV DEBIAN_FRONTEND=telety… 0B
<missing> 6 weeks ago /bin/sh -c /bd_build/prepare.sh && /bd_buil… 97.3MB
<missing> 6 weeks ago /bin/sh -c #(nop) COPY dir:e7a5eda59d878c69c… 39.4kB
<missing> 6 weeks ago /bin/sh -c #(nop) MAINTAINER Phusion <info@p… 0B
<missing> 6 weeks ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 6 weeks ago /bin/sh -c mkdir -p /run/systemd && echo 'do… 7B
<missing> 6 weeks ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$… 2.76kB
<missing> 6 weeks ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0B
<missing> 6 weeks ago /bin/sh -c set -xe && echo '#!/bin/sh' > /… 745B
<missing> 6 weeks ago /bin/sh -c #(nop) ADD file:a3344b835ea6fdc56… 112MB
The 1.82 GB of dependencies (APT & pip) is quite large as well as the SuperBuild at 627 MB. I cleaned up a bunch of redundant dependencies (listed more than once on various lines) and moved cleanup of APT and temp files to the RUN commands that create them. I'm assuming there is quite a bit more that I can clean up but I just don't know the project well enough.
Have you guys thought about breaking ODM into a bunch of smaller projects and/or containers? Make each container specific to its process. For instance, OpenSFM should have its own container and use queues, pub/sub or a custom http api to communicate. You are already using docker so having people use compose to build and start a collection of containers is not asking much of the end user. It would also help make the project more (horizontally) scaleable while making it MUCH easier to reduce dependencies down to bare minimum. It is also a bit unnerving that pip dependencies are not version specific and you are not tagging docker builds with a version.
Update: I can open PR if you wan't
@sberryman -- pull requests always welcome!
@smathermather I'm not sure it made a difference on download size, I know I at least cleaned up redundancies.
I'm running a job on the docker image after all the changes to ensure it still works. Do you guys have a pre-defined test(s) you use to verify the results of changes to the Dockerfile or underlying dependencies?
Our tests for ODM itself (as opposed to node-OpenDroneMap or WebODM) are embarrassingly lacking, but usually Dakota or Piero run pull requests through the paces manually.
I have also completely rewritten the dockerfile to conform closer to best practices. Pulled all of Superbuild component builds to seperate builds using latest software so can debug components that were failing and killed Superbuild. Uses debain stretch as base image. Still needs tuning to shrink the image size, but builds without error and produces the orthos I need.
What would be the advantage of pulling all the components apart into their own images beyond updates? See it as complicating the situation.
Can submit a PR of various code bits including dockerfile, but it is strictly for Docker use aka no native install, and nukes the current way the project does things with Superbuild. Requires tuning various other hard coded paths in some of the code.
Here is the gist: https://gist.github.com/PeterSprague/a831e5109eb62fe8a6ab7eef92d2214b
Very interesting @PeterSprague, I don't know the project well enough yet to split it up but it seems like you have at least broken the build out. Are you building each component/sub-project of superbuild as a separate layer? Did you even bother building dependencies you didn't use like ORB_SLAM2? (I guess that could be a bad assumption you didn't use SLAM.)
Based on my experience converting projects at Golf Channel to Docker, we hit a lot of walls and confusion until we broke down the project into several different isolated containers. This let us release and version control each container without having to wait for the entire team to assemble a working build. Scaling became much easier, images got smaller and the overall velocity of development increased dramatically.
The only problem with separate containers is relying on an orchestration framework such as compose, swarm, kubernetes, etc. I can't imagine the number of github issues that would be created by people struggling to get it running.
What was the precursor to this was the desire to use Cuda or OpenCL to speed things up. To make that happen, I need an ODM base image that uses newer/newest OpenCV.
Each piece of software builds on a number of layers, and I have not made any real attempt to compact the image. I diverge from the latest ODM-master in a number of places because I was getting numerous errors and finally decided to nuke the SuperBuild because I was having a hard time debugging issues.
Debian Stretch was chosen over Ubuntu 16.04 because of issues around older versions of software like gdal, and I need access to an arm64 base image with a common set of software for my use case
Thinking about it, being able to scale out some of the components could be a real advantage in a higher volume production situation. Food for thought down the line.
"It is also a bit unnerving that pip dependencies are not version specific"
I was getting "import cv2" --> numpy errors because of this. Also apt-get install various python dependencies mostly python-gdal, and then pip install them in latest odm-master added to the mess.
Be glad to PR my versions of things, but being new to team software development don't want to step on toes with my fork of the ODM code base.
Did you make any progress on the Cuda or OpenCL front? Did you switch to using FAST or ORB_GPU for feature detection and matching?
What other areas are you attempting to use Cuda or CL to speed up the pipeline?
Also with you on submitting a PR for the dockerfile, this was my way to start a discussion. I was hoping to figure out what @dakotabenjamin, @smathermather and @pierotofy had in mind for the roadmap. It would be interesting to know how many people use ODM with/without docker. Personally I don't know why anyone would NOT use docker but I hopped on that bandwagon a while ago.
Issue on the PR is it is more than just a dockerfile, had to tweak other parts of code some so it worked together. Would diverge from master.
Maybe the project could/should look at a Docker branch and a native branch, though that could get messy fast. Leave to others with more experience to give some guidance.
I concur on using Docker except use iocage for BSD environment.
RE: Cuda Intend on pursuing over the next few months. Further complicated by having to port the project to arm64 for my use case.
Original idea was to focus on OpenSfM --> OpenCV, hence moving to OpenCV-v3.4 because it has OpenCL seamlessly included, and more readily supports Cuda.
Having s separate docker and native branch sounds like more maintenance than it's worth, to me. I'd like to see a more thorough explanation of why that would benefit the project more than adjusting the configuration on the master branch to support both.
Only pushed it up as a possibility because I have found the current master structure with SuperBuild and choice of base image not working well for my use with Docker. I also had to tune some of the ODM code to revise hard-coded file/program locations.
As explained above, I needed/wanted to move to the latest software, and found continuing to utilize the SuperBuild format was causing hard to track bugs when building a new image. Docker builds need to be moved to a more micro-service architecture vs monolithic as now. This will require tools like multistage builds, Docker compose link, etc.
Not sure how or if this work should be rolled back into the ODM master, or I can continue with an ODM fork to address ODM Docker usage including Cuda. If its rolled into the ODM master, wouldn't that be a pivot for the project as it stands?
I'd be interested in seeing how we could improve the install process as a whole. It should be a simpler, more robust process, but I'm not exactly sure how that would look.
Really comes back to use cases right?
My primary interest is edge based field use, with server back-end. That leads me to Docker and a mix of amd64 and arm64v8 capabilities. Because of this I choose to use Debian as my ODM base image over Ubuntu-16.04. I found that numerous packages were missing or different between the amd64 and arm64v8 repos in Ubuntu
I'm thinking that at a minimum, to push all the dependencies and supporting software into separate image(s), then strip down the ODM code to its minimum.
I think I understand why the SuperBuild approach was used, but think it is redundant in a docker container. Nuked it for my usage after fighting with impossible to find source build bugs
Will move this more open-ended conversation over to the forum
This issue was moved to http://community.opendronemap.org/t/dockerfile-feedback/369
FYI ONLY!
I noticed a recent commit to the Dockerfile and was quite interested. Then I started digging in to see what the Dockerfile was actually doing.
There are some glaring issues with the below section based on docker caching of layers. When reviewing the best practices for docker images, They state
Always combine RUN apt-get update with apt-get install in the same RUN statement.
The reason is that anyapt-get install
on subsequent layers are using the cachedapt-get update
from previous layers. Meaning if you re-rundocker build
on a machine that has already built ODM and don't make any changes to the layers prior toapt-get update
you'll be using old cached lists and installing (potentially) old versions.https://github.com/OpenDroneMap/OpenDroneMap/blob/96be8c43aa165e5ce727dd6262b88bea106b5c88/Dockerfile#L6-L12
Then there is the issue of cleanup of APT. Running cleanup on a separate layer doesn't remove the intermediate layers which have the cached data. So you either need to remove the cache lists in the same
RUN
command as you are doingapt-get update
andapt-get install
or just remove it from the dockerfile as it is useless. https://github.com/OpenDroneMap/OpenDroneMap/blob/master/Dockerfile#L58-L59