Closed: Gelmo closed this pull request 6 years ago.
This PR is meant to address issue #1
If anyone would like to test this before merging, you can use the following image from Docker Hub:
gelmo/cytube-docker:fresh
And here is a running instance of the "fresh" image:
http://beta.dank.solutions:8888/
This instance was started like so:
docker run -ti -p 8888:8888 -p 1340:1340 -e HTTP_PORT=8888 -e IO_PORT=1340 -e IO_DOMAIN=http://beta.dank.solutions -e ROOT_DOMAIN=beta.dank.solutions -d cytube-docker
@calzoneman The secure socket.io section of the template is uncommented for the sake of production builds; should I comment it out in favor of dev work with http? Also, I noticed on my beta instance above that it is still attempting to connect with 8443, even though https is set to false.
I haven't reviewed the changes yet, but I have a few thoughts based on your description and comments above.
I agree with getting rid of the duplicated package.json from this repo, but I'm wondering if it makes sense to have this repo dockerize a specific commit of the upstream code, either via an argument to git clone or by adding calzoneman/sync as a submodule at a specific commit. That would allow the image to be updated explicitly as needed instead of having it always pull the latest HEAD, which may contain breaking changes that the docker image needs to account for.
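For example, the clone could be pinned at build time along these lines (just a sketch; the install path and default ref are placeholders, not this repo's actual layout):

    # Dockerfile sketch: pin the upstream clone; override with --build-arg SYNC_COMMIT=<hash>
    ARG SYNC_COMMIT=master
    RUN git clone https://github.com/calzoneman/sync.git /opt/cytube/sync \
     && git -C /opt/cytube/sync checkout "$SYNC_COMMIT"

The submodule route would instead record the pinned commit in this repo's own history, via git submodule add https://github.com/calzoneman/sync.git and checking out the desired commit inside it.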
A few extra values from the config template have been replaced with environment variables in an effort to make this more suitable for production
CyTube has too much configuration. I'm wondering if it would make more sense for the docker image to provide some sane configuration by default so that it works out of the box for people who don't need heavy customization, and advise power users to create their own image by using FROM cytube and copying their own configuration into place -- especially since some configuration has now been broken out into individual files in conf instead of config.yaml.
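For example, a power user's entire Dockerfile could then be reduced to something like this (image name and destination path are hypothetical):

    FROM cytube
    COPY conf/ /opt/cytube/sync/conf/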
Move docker project to mainline.
Do you mean calzoneman/sync? The pull request was originally opened there, but I opted to break it out into a separate repository in the CyTube organization. The docker image builds on top of calzoneman/sync, I don't consider it the same codebase (and it has different maintainership -- for example, if someone asked me about the docker image in IRC, I'd direct them to open an issue here)
The secure socket.io section of the template is uncommented for the sake of production builds; should I comment it out in favor of dev work with http? Also, I noticed on my beta instance above that it is still attempting to connect with 8443, even though https is set to false.
I think when David contributed the image originally in calzoneman/sync#680, he was configuring the image to bind plain HTTP only, and using a reverse proxy in nginx for TLS termination. If I had to start over on CyTube, I'd strongly consider doing it that way from the beginning.
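For reference, a minimal sketch of that kind of nginx TLS termination, assuming CyTube binds plain HTTP on 8080 (server name and cert paths are placeholders):

    server {
        listen 443 ssl;
        server_name cytube.example.com;
        ssl_certificate /etc/ssl/certs/cytube.pem;
        ssl_certificate_key /etc/ssl/private/cytube.key;
        location / {
            # socket.io needs websocket upgrade headers through the proxy
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_pass http://127.0.0.1:8080;
        }
    }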
Yeah, it can be based on a specific commit, but then it isn't very useful for dev work without modification. The only thing necessary to build from a specific commit is to change line 11 in scripts/run.sh to the desired commit. Do you want this to be designed for dev work or for production usage? The current Dockerfile sets sane values for running a local test instance; you can run one by mapping nothing other than the HTTP and socket ports:
docker run -ti -p 8080:8080 -p 1337:1337 -d cytube-docker
And if you have docker set to automatically expose ports, this works:
docker run -ti -d cytube-docker
Those commands build from git every time the container is run, which is what makes this ideal for dev work. I figured it would be useful for you and for others working on the project, but there's no reason changes can't be made to gear this more toward production usage. The only things stopping this from being sane for production are that the passwords aren't randomized and certs aren't automated. Anyone using this for production will need to set the domain variables regardless.
Starting with this image in FROM for your project is a great way to automate deployments for production (such as with docker compose), and the current state of the dockerfile is designed with that in mind. I have a production-style instance here, with https, channels work and everything:
https://beta.dank.solutions:8553
The only things that were changed to achieve this were setting a bind mount for the certs dir, adjusting the cert variables to point to that dir, setting the http/https variables to false/true respectively, and pointing the IO domain and root domain variables at the proper address. I changed the ports to avoid conflicts on the server it's running on, but it would work just as well with the default ports.
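Roughly, the invocation was along these lines (the HTTPS/cert variable names shown here are illustrative, not necessarily the template's exact keys):

    docker run -ti -p 8553:8553 -p 1340:1340 \
      -v /srv/cytube/certs:/certs \
      -e HTTP=false -e HTTPS=true \
      -e CERT_PATH=/certs/fullchain.pem -e KEY_PATH=/certs/privkey.pem \
      -e IO_DOMAIN=https://beta.dank.solutions -e ROOT_DOMAIN=beta.dank.solutions \
      -d cytube-docker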
Regarding the socket.io http issue, I've removed the earlier socket configuration work completely in favor of running from mainline, and if you test the link above you'll see it works great with https defined. I'll look into it later on.
Anyway, my main argument against adjusting this more for production, other than eventually adding the option to generate certs and skip MariaDB, is that anyone intending to use this for production will need to make their own changes no matter what. There is no sane configuration out of the box that could be run; no matter what, you need to set the domain. Since production users will need to make changes regardless, it seems to make more sense to have it built for devs out of the box.
It's your project, so I'll design it however you would like it to be designed. Please let me know your thoughts on what I've said along with what changes you want me to implement and I'll get started on that tomorrow.
I think there's some ambiguity about the use case for the image. My assumption based on the prior contributions was that the intention was to produce an image geared towards production. There aren't many people doing development work, and I'll be the first to admit the official installation process is not as simple as it could be. I don't have docker installed on my production hosts nor my primary development machine so I didn't really have a plan in mind for this repo -- a user contributed the first version and I agreed to host it in case someone else might find it useful. If the design of the docker image changes significantly for production vs. development-oriented images, then perhaps it would be more appropriate to branch and publish them under different tags, or some other way to make it clear what the distinction is.
There is no sane configuration out of the box that could be run
My point about the configuration is that I think it'd be nice to take this opportunity to simplify how much someone has to configure, something I've been meaning to do for a while. There are some configuration keys that don't really need to be customizable in Docker, for example, HTTP ports (Docker can already do host<->container port mapping), or the MySQL configuration if the MariaDB server is running inside the same container. There are a bunch more configuration keys for background job intervals, limits, HTTP settings, etc. that I doubt many people change.
Yes, there will still be some configuration required, such as the hostname, API keys, etc.
@calzoneman Great points; thank you for reviewing. I just pushed the removal of the extra build-server and changed channel storage to database. I think we should keep this version just in case people want to use it for dev work; thoughts? I'll get started on a new branch intended for production. Do you think the new branch should have MariaDB in the same container or in a separate container? Usually when using Docker for production you would put services in their own container, so I think that would be best. If we're having discrete containers for each service, we should also make a docker-compose file to make deployment easier. Perhaps one for initial deployment and another for updating cytube independently of MariaDB. This brings up some questions:
Single container or discrete services?
If discrete services, should we use a deployment script or docker-compose (see the compose sketch after this list)? Compose would be ideal for production, but some docker-oriented hosts don't have compose available. Are we making the production version for people with their own servers and those with a bit more experience, or for people intending to throw this at a DigitalOcean droplet? I think docker-compose for the initiated would be best, but you would know better than I do regarding the userbase outside of your own instance.
If discrete services, should a separate nginx/webserver container be implemented with compose, or should that be left up to the users?
MariaDB, MySQL, or another DB system?
What other features would make for a friendlier production experience?
Should the dev version and/or production version use 80 and 443 with the intention of solely using the integrated webserver? This would allow for deployment using just the expose parameter at the bottom of the Dockerfile; it wouldn't be necessary to map the ports as desired when deploying.
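To make the compose question concrete, here's a minimal sketch of what such a file might look like (image name and credentials are placeholders):

    version: "3"
    services:
      cytube:
        image: cytube/cytube-docker
        ports:
          - "8080:8080"
          - "1337:1337"
        depends_on:
          - db
      db:
        image: mariadb:10.2
        environment:
          MYSQL_ROOT_PASSWORD: changeme
          MYSQL_DATABASE: cytube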
I think we should keep this version just in case people want to use it for dev work; thoughts? I'll get started on a new branch intended for production.
Ideally we wouldn't want to maintain multiple branches because that duplicates the effort of merging changes -- I imagine both images will share the same base and some setup scripts and only have minor differences in the entrypoint, in which case maybe having two dockerfiles would make sense? Barring that, a branch is fine -- I don't know if GitHub lets you PR directly to a new branch, but I can create one for you.
Do you think the new branch should have MariaDB in the same container or in a separate container? Usually when using Docker for production you would put services in their own container, so I think that would be best.
Yeah, that sounds reasonable. I think the only reason it was done as an all-in-one container previously was as a proof of concept.
Are we making the production version for people with their own servers and those with a bit more experience, or for people intending to throw this at a DigitalOcean droplet?
I think it would be nice to support the latter. A lot of people installing it aren't really experienced linux admins, just folks trying to set up a private server for their community. The installation process kind of sucks and I often get emailed or contacted on IRC by people struggling with various steps of the process, so it'd be handy to be able to say "if you just want to get a minimal server running, use these basic instructions". People have asked about a 1-click installer before.
As for whether to use compose or not, I'll defer to your judgment on that; I'm not super familiar with docker-compose.
If discrete services, should a separate nginx/webserver container be implemented with compose, or should that be left up to the users?
Reverse proxying with docker may introduce some complications such as having to adjust the configuration so that CyTube respects X-Forwarded-For headers from whatever internal NAT IP range docker is assigning. We can always add it later.
MariaDB, MySQL, or another DB system?
The software only supports MariaDB and MySQL right now; personally, I use MariaDB.
What other features would make for a friendlier production experience?
You may want to configure and expose Prometheus for monitoring. Prometheus and Grafana (the dashboard I use for CyTube) are both docker images as well, so it might be neat to include those.
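Both have official images on Docker Hub, so standing them up next to the app is a one-liner each (default ports shown; scrape and dashboard configuration would still need to be mounted in):

    docker run -d -p 9090:9090 prom/prometheus
    docker run -d -p 3000:3000 grafana/grafana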
I assume that you'd want to bind-mount the log files so you don't lose your prod logs every time you restart the container.
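e.g. something like this, where the container-side path is an assumption about where the app writes its logs:

    docker run -ti -v /srv/cytube/logs:/opt/cytube/sync/logs -d cytube-docker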
There are also a couple of console commands that I use occasionally via servcmd.sh.js, which communicates via a UNIX socket (contributed by @Xaekai) -- exposing those through docker somehow would be nice.
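One hypothetical way to expose those would be docker exec against the running container (container name and script location assumed; the exact arguments depend on the script):

    docker exec -it cytube node servcmd.sh.js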
Should the dev version and/or production version use 80 and 443 with the intention of solely using the integrated webserver? This would allow for deployment using just the expose parameter at the bottom of the Dockerfile; it wouldn't be necessary to map the ports as desired when deploying.
Seems reasonable.
Ideally we wouldn't want to maintain multiple branches because that duplicates the effort of merging changes -- I imagine both images will share the same base and some setup scripts and only have minor differences in the entrypoint, in which case maybe having two dockerfiles would make sense? Barring that, a branch is fine -- I don't know if GitHub lets you PR directly to a new branch, but I can create one for you.
What would be even better is implementing dev and production deployment in a single dockerfile, along with a variable switch for defining your use case when deploying. That can definitely all be done in a single Dockerfile/branch. Where the other branches come into play is automated builds on Docker Hub, Docker Cloud, and other CI solutions. That said, automation can be defined by directory rather than branch, and it's not uncommon for projects with many use cases to manage multiple "tags" (docker branches) in a single repo that way. Here's a good example:
https://github.com/docker-library/mariadb
Each of those directories is a different tag for deployment. Deploying MariaDB 10.2 is as easy as setting the build image to mariadb:10.2. Perhaps this would be the best way to proceed?
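Locally, that layout would build along these lines (using the cytube/cytube-docker naming proposed later in this thread):

    docker build -t cytube/cytube-docker:latest latest/
    docker build -t cytube/cytube-docker:dev dev/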
Yeah, that sounds reasonable. I think the only reason it was done as an all-in-one container previously was as a proof of concept. I think it would be nice to support the latter. A lot of people installing it aren't really experienced linux admins, just folks trying to set up a private server for their community. The installation process kind of sucks and I often get emailed or contacted on IRC by people struggling with various steps of the process, so it'd be handy to be able to say "if you just want to get a minimal server running, use these basic instructions". People have asked about a 1-click installer before.
I completely agree. Assuming the ultimate goal is different Dockerfiles for different use cases, it would be very easy to provide both anyway. Using multiple containers requires docker-compose or another orchestration tool (I'll absolutely be making a compose.yaml for this in the near future), so while it is the best solution, it isn't feasible for all users. For example, users who want to throw it onto the cheapest VPS or "droplet" they can get their hands on may not be allotted more than one container. I do expect the majority of those using this for production to use a separate container for sql, though. I just thought it was worth mentioning that there may be use cases for a single-container solution in production rather than for development. It's also worth mentioning that NOT including a sql server in the cytube container could be even more useful for those using this for development; they would be able to redeploy much more quickly. I suppose I shouldn't have asked about this, since I will likely be implementing both options for both use cases. I will prioritize discrete containers for production usage this week, though, as I agree that it will be the most common and desirable use case.
The current image in this pull IS a 1-click installer, and once I've made a compose file for multiple containers, that will be as well.
Reverse proxying with docker may introduce some complications such as having to adjust the configuration so that CyTube respects X-Forwarded-For headers from whatever internal NAT IP range docker is assigning. We can always add it later.
Great point. There's a popular nginx image that automates reverse-proxy mapping whenever a container starts, so I'll eventually wire that in unless you beat me to it.
You may want to configure and expose Prometheus for monitoring. Prometheus and Grafana (the dashboard I use for CyTube) are both docker images as well, so it might be neat to include those.
Can you add me to this repo so I can map things out on the project tracker? I had already intended to make a rancher-oriented cluster which provides the same features, and I think using separate solutions like Prometheus and Grafana could be much better and more lightweight (as well as more "friendly" for non-commercial usage). Perhaps there should be different tags for different orchestration/monitoring solutions. Thanks to most of them having well-documented Docker containers, it would be relatively trivial to maintain multiple solutions. The project tracker will make it much easier to manage all of this, as I'm now realizing there are many more goals to work toward than I had initially foreseen.
I assume that you'd want to bind-mount the log files so you don't lose your prod logs every time you restart the container.
Good point; I feel silly for not including that in the README, especially since the current design is for dev.
There are also a couple of console commands that I use occasionally via servcmd.sh.js, which communicates via a UNIX socket (contributed by @Xaekai) -- exposing those through docker somehow would be nice.
Definitely another thing to place on the project tracker.
I've just reread most of this, and I think I've come up with an outline that we can agree on. Separate the different use cases by tags, having either two or four. The mainline tag will be "latest", which is the default when pulling without specifying a tag (i.e. cytube/cytube-docker), and the other tag will be dev (i.e. cytube/cytube-docker:dev). The current state of this container becomes the dev tag, and the mainline tag will be the production-ready version. The difference between two and four tags comes down to whether we use a switch/variable to determine whether or not to install MariaDB, versus having separate tags for each option (i.e. latest, latest-sql, dev, and dev-sql).
Let me know your thoughts on this. If you agree that this could work and that the current image is suitable for dev work with the current state of the mainline cytube repo, let me know if you want me to move this into a directory titled dev, or if I should make a PR for a new branch titled dev.
For the production tags, do you want it to pull a tarball or pull from git at a specific commit? Please also let me know the archive/commit that you would like me to use for this.
What would be even better is implementing dev and production deployment in a single dockerfile, along with a variable switch for defining your use case when deploying [...]
Sure, any of those seem reasonable. I just thought branching seemed a little heavy-handed since I assumed Docker would have some way to support this -- I didn't have a specific way in mind.
I do expect the majority of those using this for production to use a separate container for sql, though. I just thought it was worth mentioning that there may be use cases for a single-container solution in production rather than for development
I think the best solution for getting it running without an external db instance would be for me to get off my butt and support SQLite3 ;)
Perhaps there should be different tags for different orchestration/monitoring solutions.
I mentioned Prometheus because CyTube exposes its own in-process monitoring that way, but for monitoring process-external container stats, yeah it may make sense to integrate with whatever Docker monitoring solutions exist.
I've just reread most of this, and I think I've come up with an outline that we can agree on. Separate the different use cases by tags, having either two or four. [...]
I think the previous maintainer tried to configure this repo to autobuild to Dockerhub cytube/cytube but I don't know how that's set up exactly -- in any case, it doesn't seem to have ever built anything. As long as the relevant scripts exist in git, at least someone can manually build it while the release model gets worked out.
For the production tags, do you want it to pull a tarball or pull from git at a specific commit? Please also let me know the archive/commit that you would like me to use for this.
I don't have a specific commit in mind, I just think the docker release should be pinned at a known-working version and bumped manually when the maintainer knows the new CyTube commits aren't going to bust the docker image. Maybe the CI handles this automatically, I dunno. Ideally I would tag releases in calzoneman/sync and docker would follow those.
Hopefully you should have access to the project tracker now.
Oh, I see, the autobuild is on the dockerhub side.
I'm biased to see it as a fresh opportunity to create a new, alternative installation/configuration mechanism that doesn't carry the baggage of years of backwards compatibility and legacy features for existing installations.
I definitely understand that. Is there a new repo that I should/can use as a base for a future version? There still isn't really much I can do in that regard in the current state; any changes toward doing away with the config file would go to cytube rather than cytube-docker.
On this note, I think a Dockerfile/image for the new-generation configuration implementation should be prioritized once you have it upstream (perhaps in a new branch, or in one of the other repos here; I'm not sure what the next-gen structure is), for the sake of quick tests.
I think the best solution for getting it running without an external db instance would be for me to get off my butt and support SQLite3 ;)
I don't disagree, and that's definitely a feature I'll implement in the Dockerfile once you've implemented it upstream. In the interim, however, offering the choice will be trivial, so I'll proceed with doing so.
I think the previous maintainer tried to configure this repo to autobuild to Dockerhub cytube/cytube but I don't know how that's set up exactly -- in any case, it doesn't seem to have ever built anything. As long as the relevant scripts exist in git, at least someone can manually build it while the release model gets worked out.
I'm not sure how Hub worked at the time, but the way it works now when creating an automated build repo is that it will create a new build whenever a branch is pushed to, even if there haven't been changes. Assuming it is configured that way, I could get an image there by starting a new branch here. In order to manually set it to auto-build for changes within a directory rather than a branch (for example, if placing dev and production use cases in the same branch), however, I'll need access to the repo on Hub, either via that cytube account or an organization. An organization would be ideal. Do you have access to that account, or should I reach out to the previous maintainer?
I don't have a specific commit in mind, I just think the docker release should be pinned at a known-working version and bumped manually when the maintainer knows the new CyTube commits aren't going to bust the docker image. Maybe the CI handles this automatically, I dunno. Ideally I would tag releases in calzoneman/sync and docker would follow those.
Yeah, I'll have a dev version that builds from git HEAD, and the default version will be stable, building from the latest version on the release page. The CI at Hub does handle that; images don't get pushed for public pulling unless the build passes. That said, because we're pulling from git when the container is started rather than when it's built, a passing build isn't necessarily a functional image. I'll look into other implementations for the stable version later today and/or tomorrow, which could make the CI result a better barometer.
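For the stable flavor, one sketch would be to fetch a pinned release tarball at build time instead of cloning HEAD (the tag is a placeholder until releases are tagged again, and curl is assumed to be installed in the build image):

    # Dockerfile sketch: build from a pinned release archive
    ARG SYNC_TAG=3.0
    RUN curl -L "https://github.com/calzoneman/sync/archive/${SYNC_TAG}.tar.gz" | tar -xz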
I think you have push access on GitHub, so you should be able to create the new branches. Let me know if not.
Yes, I do. Work got really busy this week and I needed to step away from other projects. I'm on weekend time now, so I'll have dev and stable pushed within the next 24 hours. Do you have access to the Docker Hub cytube account so one of us can set up auto triggering for each directory?
No rush. I'll go ahead and close this PR since it's not necessary for ongoing development. I'll look into the docker hub auto trigger later.
Hey @calzoneman
I'm reopening this so I can continue pushing the changes you suggested to my repo and pull the commit history more easily. I got distracted by some other projects, but I'm committed to getting a stable release pushed out this weekend. The latest release in the mainline Sync repo is over a year old; please let me know a commit I should pull from as stable, or let me know if you intend to push a stable tarball some time soon. There have been many significant changes since May 2017. Should I push the production-style Dockerfile/image building from the git HEAD, or from the current commit specifically?
Hi,
What I meant by closing the pull request is that you have push access now so you can create a development branch for the work you're doing to avoid churn. I think we've already discussed the core principles at length and it's not necessary for me to review each individual commit; feel free to open a PR to merge a feature branch when some significant functionality has been changed that warrants a second opinion.
You can start with the current git HEAD -- I will try to fix up the versioning/release tagging situation soon.
This is an update to the cytube-docker project which uses MariaDB rather than MySQL, as well as a newer version of Alpine. A few extra values from the config template have been replaced with environment variables in an effort to make this more suitable for production. This repo's separate package.json has been removed in an effort to make this more suitable for development.
Here are the future goals I intend to work toward:
Use certbot to generate certs based on $ROOT_DOMAIN (see the sketch after this list).
Give the user the option to skip the installation of MariaDB, which would be a better fit for users who intend to use a remote SQL instance.
Update mainline config.template.yaml to use more variables in place of hard-coded values.
Move docker project to mainline.
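As a sketch of the certbot goal, a one-shot standalone issuance at container start could look roughly like this (the email is a placeholder, and standalone mode needs port 80 reachable from the internet):

    certbot certonly --standalone --non-interactive --agree-tos \
      -m admin@example.com -d "$ROOT_DOMAIN"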