ADDITIONALLY: use the Docker Compose DNS for everything as well. We shouldn't be referencing resources by static IP anywhere if we can avoid it.
Sometimes the OpenShift-deployed 8Knot instance hangs and no work can be done.
I think this is because the Redis container restarts and, since its cluster IP is ephemeral, it is assigned a new IP. When this happens, the workers cannot discover the new Redis container because they aren't configured to find it via DNS. If the container were reached through a Service, the OCP DNS would handle the bookkeeping to keep everything synchronized.
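As a rough illustration of the failure mode and the DNS-based fix (the host name and IP below are assumptions for illustration, not the actual 8Knot manifest values):
from redis import Redis

# Fragile: a pod IP captured at deploy time goes stale when the Redis pod restarts.
# cache = Redis(host="10.128.2.47", port=6379)  # hypothetical ephemeral cluster IP

# Robust: resolve the Service name through the cluster DNS at connection time.
cache = Redis(host="redis-cache", port=6379)  # "redis-cache" is an assumed Service name
cache.ping()  # fails fast if the name doesn't resolve or Redis is down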
UPDATE:
DNS in docker-compose works great. In a Python program on the same compose network,
cache = Redis(host="redis", port=6379)
is all that's required.
The deployed Redis instance is published on a host port chosen at random by the operating system, so there's no collision.
This also means that we can launch multiple composes:
docker compose --project-name <name> up
without port conflicts.
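To find out which host port was actually assigned to a service, docker compose has a lookup command; for the Redis service defined in the compose file below, something like:
docker compose port redis-cache 6379
prints the host address and the randomly chosen port.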
I wrote a little toy app that I think should be documented for future reference.
This application uses Docker to containerize Flask applications with distinct IDs that are reported to the user.
Each application is connected to a single Redis instance and can set/get a value in the instance.
docker-compose handles the logical network and the DNS-based service-name resolution so that the Flask applications can all find the Redis cache by its service name. It also handles scaling, so multiple Flask application containers can be available at a given time.
To accommodate multiple applications that handle the same kind of requests, an nginx web server container acts as a reverse proxy and load balancer that sends requests round-robin to the available application servers.
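For example, to run several replicas behind the proxy (a sketch; the service name matches the compose file below):
docker compose up --scale flask-server=3
Requests to the proxy should then be distributed across the three replicas.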
Here's the code and configuration for this applet:
File structure
../compose-testing
├── Dockerfile
├── README.md
├── app.py
├── docker-compose.yaml
└── nginx.conf
docker-compose.yaml
services:
  flask-server:
    build: .
    ports:
      - 5001
    depends_on:
      - redis-cache
  reverse-proxy:
    image: nginx:latest
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - flask-server
    ports:
      - "5001:5002"
  redis-cache:
    image: redis:latest
    ports:
      - 6379
Dockerfile
FROM python:alpine
WORKDIR /flaskapp
COPY ./app.py .
RUN pip3 install Flask redis
CMD python3 app.py
nginx.conf
user nginx;

events {
    worker_connections 1000;
}

http {
    server {
        listen 5002;

        location / {
            proxy_pass http://flask-server:5001;
        }
    }
}
app.py
from flask import Flask
import uuid
from redis import Redis

cache = Redis(host="redis-cache", port=6379)

app = Flask(__name__)
app_id = str(uuid.uuid1())


@app.route("/")
def index():
    return f"Flask Server {app_id}"


@app.route("/get")
def get():
    val = cache.get("key")
    return f"got {val}: {app_id}"


@app.route("/set/<value>")
def set(value):
    cache.set("key", value)
    return f"set {value}: {app_id}"


if __name__ == "__main__":
    app.run(port=5001, host="0.0.0.0")
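A quick way to exercise the stack from the host (a sketch; assumes the containers are up, nginx is published on localhost:5001 as in the compose file above, and the requests package is installed):
import requests

print(requests.get("http://localhost:5001/").text)           # Flask Server <some uuid>
print(requests.get("http://localhost:5001/set/hello").text)  # set hello: <some uuid>
print(requests.get("http://localhost:5001/get").text)        # got b'hello': <some uuid>
Repeating the first request should show different app IDs as the proxy cycles through the scaled Flask containers.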
Implements DNS-only service discovery and reverse-proxy load balancing w/ nginx. Will solve #393 when implemented.
One question is whether we're intending to use a docker-compose strategy to allow users to stand up a production-grade instance of the application on bare metal. For the time being, the recommendation should be that users do this at their own discretion and that container orchestration platforms are the preferred alternative. When a Helm chart is available, that will be clearer.
Next step is to add gunicorn WSGI servers in front of the Flask servers to be more similar to the existing 8Knot deployment.
Make logging visible from the Flask application:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
Launch the Flask application w/ gunicorn workers instead of the built-in Flask server:
CMD [ "gunicorn", "--bind", ":5001", "app:app"]
Add non-default compose network:
networks:
  app-net:
    driver: bridge
Connect services to network:
networks:
  - app-net
Before I understood namespace by-name DNS discovery of OpenShift Services, I was using the local ephemeral IP of Redis that is injected into downstream pods to connect the Celery workers to the cache.
We should update this so that discovery is only ever done via DNS, allowing the IPs to be ephemeral without consequences.
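A minimal sketch of what DNS-only discovery could look like for the workers (the Service name and environment variable here are illustrative, not the actual 8Knot values):
import os

from celery import Celery

# Resolve the broker by Service name through the cluster DNS; never by pod IP.
redis_host = os.environ.get("REDIS_SERVICE_NAME", "redis-cache")  # hypothetical env var

celery_app = Celery(
    "worker",
    broker=f"redis://{redis_host}:6379/0",
    backend=f"redis://{redis_host}:6379/0",
)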