Closed: MikeTheCanuck closed this issue 6 years ago
At the very least we have one image that is the official PostgreSQL image (obtained from the Docker Store, which is a registry!) augmented with the PostGIS packages. I'm assuming we'll also end up with a GeoDjango image, and probably an image to serve up whatever the front end needs that doesn't come from a CDN.
In general, images are a way to standardize software - make sure all the developers have the same setup, rather than having to spend time troubleshooting environments, versions, etc.
Thanks Ed. Happy you’ve found something that is working for the current local development needs. I’m not seeing specific needs on the cloud side for which Docker/containerisation of the data layer is an obvious solution. It sounds like an interesting experiment, and one particular implementation approach - I can’t say whether containerising the data layer is necessarily the best approach for CI/CD to AWS.
In last year’s experience just containerising the API layer, we heard some significant pushback from developers that the download latency of grabbing the whole image didn’t present an advantage over just running the API layer directly.
So long as we focus on automating the schema creation/evolution and data import steps, I believe that we will make great strides in improving the developer’s experience this season.
I wasn't saying put the data on images - just that we need a way to manage the images as images, either with an automated build (pull from GitHub) like Docker Hub, or push to a registry via Travis CI. Do we have that now? Do we have a way to track changes in the underlying base images?
I have three images that are more or less stable in my opinion. There's no reason I can't post them to Docker Hub except that I don't want to deal with users outside of Hack Oregon. ;-)
I have some sizing info on the current crop of Data Science Pet Containers images and it's not particularly good news:
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
postgis      latest   1a16dad00d02   20 hours ago   1.82GB
amazon       latest   656af589d0f3   20 hours ago   1.2GB
rstats       latest   74d8e0dce816   20 hours ago   1.83GB
jupyter      latest   f0271f199922   21 hours ago   3.82GB
That is a total of 8.67 GB. I took a look at the Amazon container registry pricing and it looks like it's $0.09 per gigabyte downloaded. So every time someone downloads the whole stack it costs about 78 cents.
That's not a lot, but I can't predict how often people will be doing this. There's no reason we can't host these in free public repositories for open source projects except having to deal with users outside of Hack Oregon.
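For anyone who wants to rerun the numbers as the images change, here is a rough sketch of the arithmetic above. The sizes are copied from the listing in this comment and the $0.09/GB rate is the figure quoted here, not a verified current price. The function names and the pull count in the usage example are made up for illustration.

```python
# Image sizes in GB, copied from the `docker images` listing in this comment.
IMAGE_SIZES_GB = {
    "postgis": 1.82,
    "amazon": 1.2,
    "rstats": 1.83,
    "jupyter": 3.82,
}

# Per-gigabyte download price as quoted in this thread (assumption: it may
# not match what the registry actually charges today).
PRICE_PER_GB_USD = 0.09


def stack_size_gb(sizes):
    """Return the combined size of all images, in GB."""
    return sum(sizes.values())


def download_cost_usd(sizes, price_per_gb=PRICE_PER_GB_USD, pulls=1):
    """Estimate the transfer cost of pulling the whole stack `pulls` times."""
    return stack_size_gb(sizes) * price_per_gb * pulls


if __name__ == "__main__":
    print(f"Total stack size: {stack_size_gb(IMAGE_SIZES_GB):.2f} GB")
    print(f"Cost per full pull: ${download_cost_usd(IMAGE_SIZES_GB):.2f}")
    # Hypothetical volume: 100 full pulls over a season.
    print(f"Cost for 100 pulls: ${download_cost_usd(IMAGE_SIZES_GB, pulls=100):.2f}")
```

This reproduces the 8.67 GB total and the roughly 78 cents per full pull. It ignores layer caching, so repeat pulls by the same developer would likely cost less than this estimate.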
@znmeb in issue #3 said:
@znmeb, can you tell me more about what problem a Docker registry is solving for you or your team? I know how they work in general, and we're using them in our production-ization of last season's work - I'd like to understand the current problem you're trying to solve by publishing pre-baked images to a registry. Thanks!