django / djangoproject.com

Source code to djangoproject.com
https://www.djangoproject.com/
BSD 3-Clause "New" or "Revised" License

Remove dependency on memcached #1023

Open apollo13 opened 3 years ago

apollo13 commented 3 years ago

We currently have both a memcached and a redis cache configured. We should get rid of one of them :)

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

carltongibson commented 1 year ago

We should implement this.

pauloxnet commented 1 year ago

I don't know the real infrastructure on which the site is deployed. Which cache are we using in production: Memcached or Redis? With that information we could open a correct PR.

carltongibson commented 1 year ago

@pauloxnet As I read the report, the goal would be to use Redis.

CACHES = {
    'default': {
        'BACKEND': 'django_pylibmc.memcached.PyLibMCCache',
        'LOCATION': SECRETS.get('memcached_host', '127.0.0.1:11211'),
        'BINARY': True,
        'OPTIONS': {
            'tcp_nodelay': True,
            'ketama': True
        }
    },
    'docs-pages': {
        'BACKEND': 'redis_cache.RedisCache',
        'LOCATION': SECRETS.get('redis_host', 'localhost:6379'),
        'OPTIONS': {
            'DB': 2,
        },
    },
}

So maybe have the default cache use the built-in Redis backend, and remove the separate docs-pages cache, so we only have one backend in use (or one service at least).

We should be able to remove a few dependencies at that point.

@apollo13 — any guidance welcome 😉

apollo13 commented 1 year ago

So yes https://github.com/django/djangoproject.com/blob/main/djangoproject/settings/prod.py#L15-L32 lists two caches. I'd argue that one of those can be removed. I think we might wanna remove memcached assuming we are using some special features from redis? So first and foremost it is a question of how we use those backends (code-wise) and then we can see about removing one.

marksweb commented 1 year ago

From a code perspective I'd agree with @carltongibson - drop to a single cache so the docs & site use the same redis db. There isn't that much in the project using cache beyond the basics so a single instance would do it.

Unless, of course, the metrics on the infrastructure show there's a need for a split setup.

However, the built-in redis backend is new in Django 4.0, right? And the site is still on 3.2.

On a side note, if the redis connection isn't using SSL it may also be worth looking at that - though that's environment variable based so I can't see if that's already happening. (The SSL connection uses rediss:// on port 6380.)
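As a minimal sketch of the environment-variable approach mentioned above: the helper below builds a cache `LOCATION` URL and switches between the `redis://` and `rediss://` schemes. The function name and all environment variable names (`REDIS_HOST`, `REDIS_PORT`, `REDIS_DB`, `REDIS_USE_TLS`) are hypothetical and not taken from the project's settings.

```python
import os


def redis_location(env=None):
    """Build a Redis URL suitable for a CACHES[...]["LOCATION"] setting.

    The variable names read here are illustrative assumptions, not the
    names used by djangoproject.com.
    """
    env = os.environ if env is None else env
    # Treat a small set of truthy strings as "use TLS".
    use_tls = env.get("REDIS_USE_TLS", "").lower() in {"1", "true", "yes"}
    scheme = "rediss" if use_tls else "redis"
    host = env.get("REDIS_HOST", "127.0.0.1")
    port = env.get("REDIS_PORT", "6379")
    db = env.get("REDIS_DB", "0")
    return f"{scheme}://{host}:{port}/{db}"
```

With nothing set, this falls back to a plain local connection, which matches the same-machine setup; a remote deployment would set the TLS flag along with its own host and port.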

bmispelon commented 2 months ago

I came across this while doing some cleaning up in the old issues. Now that we're running a non-ancient Django version (4.2 as I type this) we can finally use Django's built-in redis cache backend and I've opened a PR for this (see right above).

There are still a few unknowns that I need to figure out before I can deploy it (see the checklist in the PR) but overall it's looking good.

A few answers to the questions asked above:

> Unless, of course, the metrics on the infrastructure show there's a need for a split setup.

-> As far as I can tell, we use two caches so we can purge them separately. The docs cache gets purged when the docs are rebuilt, and having a separate one makes that easy.

> [...] the redis connection isn't using SSL

-> It is not, but I was under the impression that this was OK because the redis server is running on the same machine as the Django project. Is that a wrong assumption?

marksweb commented 2 months ago

> Unless, of course, the metrics on the infrastructure show there's a need for a split setup.
>
> -> As far as I can tell, we use two caches so we can purge them separately. The docs cache gets purged when the docs are rebuilt, and having a separate one makes that easy.

That seems a good reason to keep two caches, especially if the impact on the db from a cold cache is significant. But at least it could drop to just using redis and having a separate db for each.

So maybe it ends up being:

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    },
    "docs-pages": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/2",
    }
}
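One detail worth checking with a layout like this: Django's built-in Redis backend clears a cache by flushing the selected Redis database, so two aliases can only be purged independently if they point at different databases. The sketch below is a hypothetical sanity check (the helper names are made up for illustration) that reports aliases sharing the same host, port, and database. It only understands the `/N` path form of the URL, not a `?db=` query parameter.

```python
from collections import defaultdict
from urllib.parse import urlsplit

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    },
    "docs-pages": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/2",
    },
}


def redis_target(location):
    """Return (host, port, db) for a redis:// or rediss:// URL."""
    parts = urlsplit(location)
    db = parts.path.lstrip("/") or "0"  # no path means database 0
    return (parts.hostname, parts.port or 6379, int(db))


def shared_targets(caches):
    """Map each (host, port, db) used by more than one alias to those aliases."""
    by_target = defaultdict(list)
    for alias, config in caches.items():
        by_target[redis_target(config["LOCATION"])].append(alias)
    return {t: aliases for t, aliases in by_target.items() if len(aliases) > 1}
```

An empty result from `shared_targets(CACHES)` means no two aliases share a database, so clearing the docs cache should leave the default cache alone.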

> [...] the redis connection isn't using SSL
>
> -> It is not, but I was under the impression that this was OK because the redis server is running on the same machine as the Django project. Is that a wrong assumption?

No, that's not a wrong assumption. If that's how it's being operated then that should be ok. I wasn't aware of the setup & I'm used to running on separate instances so thought I'd mention it.