cfpb / regulations-site

(DEPRECATED) Web interface for viewing U.S. federal regulations and other regulatory information

Add long-term caching #671

Closed cmc333333 closed 9 years ago

cmc333333 commented 10 years ago

Many of our pages take a second or more to load, yet change very rarely. Add an eregs-specific cache which will store these pages for 15 days. Default to a file-system solution so we can easily clear it when new data is present.

On a pre-caching script which loads several hundred pages, this cache caused a 40x speedup.

cmc333333 commented 10 years ago

Note that this requires modifying settings (if the base settings are not used, or are overwritten).

For development, you probably want:

    'eregs_longterm_cache': {
        'BACKEND': 'django.core.cache.backends.dummy.DummyCache'
    },

and in production:

    'eregs_longterm_cache': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/tmp/eregs_longterm_cache',
        'TIMEOUT': 60*60*24*15,     # 15 days
        'OPTIONS': {
            'MAX_ENTRIES': 10000,
        },
    },
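
The original description mentions that a file-system cache is easy to clear when new data arrives. A minimal sketch of what that could look like follows; `clear_file_cache` is a hypothetical helper, not something in regulations-site, and it assumes the cache lives in a single flat directory such as the `LOCATION` above.

```python
import os


def clear_file_cache(cache_dir):
    """Delete every regular file directly inside cache_dir; return the count.

    Hypothetical helper for illustration only. Sub-directories and the
    directory itself are left in place; a missing directory counts as empty.
    """
    removed = 0
    if not os.path.isdir(cache_dir):
        return removed
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path):
            os.remove(path)
            removed += 1
    return removed
```

Calling `clear_file_cache('/tmp/eregs_longterm_cache')` after a data import would force pages to be regenerated on the next request. From inside Django, `caches['eregs_longterm_cache'].clear()` (with `caches` imported from `django.core.cache`) should have a similar effect without touching the file system directly.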

cmc333333 commented 10 years ago

@rosskarchner You may be the best set of eyes to review this

ascott1 commented 9 years ago

Ping @rosskarchner. We've certainly gotten feedback that the site can feel slow. I'd love to see this in action.

rosskarchner commented 9 years ago

I'm embarrassed that I just found this, but I think it's gonna do a world of good.

ascott1 commented 9 years ago

:+1: no embarrassment necessary. Better late than never!

rosskarchner commented 9 years ago

@ascott1 (and @cmc333333 maybe)-- does this pre-caching script still run? how often?

cmc333333 commented 9 years ago

It used to run off of the demo server, which, I believe, was re-imaged -- I don't know if it's still running. It was just a bash or Python script that hit a whole bunch of URLs on a cron. I believe it ran every hour, but I'm not certain. I think I wrote some internal docs about it.
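
The original script is not shown in this thread, so the following is only a sketch of the idea described above: request each URL once so the server renders the page and stores it in the cache. The function names and the injectable `fetch` parameter are assumptions made for illustration.

```python
import urllib.request


def fetch_status(url, timeout=30):
    """Request url and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_cache(urls, fetch=fetch_status):
    """Request each URL once so the server renders and caches the page.

    Returns a dict mapping each URL to its status code, or to the
    exception raised while fetching it.
    """
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:  # one bad page should not stop the run
            results[url] = exc
    return results
```

A cron entry could then run a small wrapper around `warm_cache` with the list of regulation URLs at whatever interval suits the deployment; the hourly schedule mentioned above is recalled from memory, not confirmed.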

rosskarchner commented 9 years ago

Thanks, CM!

rosskarchner commented 9 years ago

We are now using this in production!

I'm kind of bummed that the page cache isn't shared between servers (so every URL is theoretically generated twice), but we're still thinking through that. Basically: is the increased efficiency of cache usage worth the overhead of a network mount or some other solution?

ascott1 commented 9 years ago

Whoa, eRegs screeeeaaaaaammmmssss now. Amazing.

khandelwal commented 9 years ago

Yeah, the speed improvements are AWESOME.

ascott1 commented 9 years ago

I'm just clicking ALL THE THINGS.

khandelwal commented 9 years ago

CLICK, CLICK, CLICK.

cmc333333 commented 9 years ago

Yay caching. A goal I've had for a long time is to make a static site generator for this, and the content, by definition, almost never changes. The long-term cache is a middle ground -- no significant rework of the underlying system, yet massive performance boost.

@rosskarchner - if you wanted to share across servers, memcached is probably the answer. The only tricky bit would be to verify timeouts. Luckily, you can configure all of that without changing the eregs code base :)
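
As a rough sketch of the memcached suggestion, the settings entry might look like the following. The host name is a placeholder, and the backend class depends on the Django version in use, so treat both as assumptions rather than a tested configuration.

```python
# Hypothetical shared-cache variant of the production setting.
CACHES = {
    # ... 'default' and any other existing entries stay as they are ...
    'eregs_longterm_cache': {
        # PyMemcacheCache ships with Django 3.2+; older releases used
        # django.core.cache.backends.memcached.MemcachedCache instead.
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': 'cache.example.internal:11211',  # assumed host:port
        'TIMEOUT': 60 * 60 * 24 * 15,  # 15 days, same as the file-based setup
    },
}
```

On the timeout question: memcached treats expiration values longer than 30 days as absolute timestamps rather than relative offsets, so the 15-day value stays on the simple side of that boundary.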

rosskarchner commented 9 years ago

Things may have gotten a bit faster (or I'm imagining things), as these servers are now using SSDs.

cmc333333 commented 9 years ago

@ascott1 I revise:

For development, you probably want:

    'eregs_longterm_cache': {
        'BACKEND': 'django.core.cache.backends.dummy.DummyCache',
        'TIMEOUT': 60*60*24*15,     # 15 days
    },
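
One way to keep the development and production variants side by side in a single settings module is sketched below. The helper name `longterm_cache_settings` and the `debug` switch are assumptions for illustration; the values themselves are the ones given in this thread.

```python
FIFTEEN_DAYS = 60 * 60 * 24 * 15


def longterm_cache_settings(debug):
    """Return the 'eregs_longterm_cache' entry for dev (debug=True) or production."""
    if debug:
        # Development: nothing is actually stored.
        return {
            'BACKEND': 'django.core.cache.backends.dummy.DummyCache',
            'TIMEOUT': FIFTEEN_DAYS,
        }
    # Production: file-based cache, as in the settings posted earlier.
    return {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/tmp/eregs_longterm_cache',
        'TIMEOUT': FIFTEEN_DAYS,
        'OPTIONS': {'MAX_ENTRIES': 10000},
    }
```

In a settings file this could be used as `CACHES['eregs_longterm_cache'] = longterm_cache_settings(DEBUG)`, assuming `CACHES` and `DEBUG` are already defined there.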