theforeman / foreman-infra

Puppet modules and scripts to manage Foreman project infrastructure
https://theforeman.github.io/foreman-infra
Apache License 2.0

Fastly CDN #476

Closed ehelms closed 4 years ago

mmoll commented 6 years ago

@evgeni What's the plan to proceed here? :)

evgeni commented 6 years ago

Hah, so the plan is "CDN all the things", or, slightly more elaborate:

  1. get @evgeni access to RackSpace to investigate logging to their S3 backend
  2. actually deploy logging to our current endpoints
  3. move more endpoints (probably not deb), monitoring what falls over
  4. investigate if we can have by-hash support in freight so that deb can use better caches, enable it and CDN debian too
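
To illustrate why by-hash matters for caching in step 4: with `Acquire-By-Hash: yes` in the Release file, APT fetches index files from a content-addressed path, so the URL of a given index never changes content and a CDN can cache it for a long time. A minimal sketch of how such a path is derived, assuming the standard `by-hash/<algo>/<digest>` layout; the suite and component names below are made up for illustration and this is not freight code:

```python
import hashlib
from pathlib import PurePosixPath


def by_hash_path(index_path: str, content: bytes, algo: str = "SHA256") -> str:
    """Return the by-hash location for an index file such as Packages.gz.

    The file is expected at <dir>/by-hash/<algo>/<hex digest>, where <dir>
    is the directory that holds the original index file.
    """
    digest = hashlib.new(algo.lower(), content).hexdigest()
    parent = PurePosixPath(index_path).parent
    return str(parent / "by-hash" / algo / digest)


# Hypothetical suite/component names and file content, for illustration only.
example = by_hash_path("dists/stable/main/binary-amd64/Packages.gz", b"example index")
print(example)
```

A repository tool would have to write each index to this path in addition to the classic one and keep older versions around for a while.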

mmoll commented 6 years ago

> get @evgeni access to RackSpace to investigate logging to their S3 backend

@evgeni got :envelope_with_arrow:.

> investigate if we can have by-hash support in freight so that deb can use better caches, enable it and CDN debian too

Unfortunately the answer here is only "pull requests welcome".

ekohl commented 6 years ago

> Unfortunately the answer here is only "pull requests welcome".

Or suggest a tool that solves our issues and supports by-hash.

mmoll commented 6 years ago

> Or suggest a tool that solves our issues and supports by-hash.

Yes, I wouldn't be against replacing freight with, let's say, pulp-deb.

evgeni commented 6 years ago

Maybe, some day, we could use katello to publish katello, /me dreams.

GregSutcliffe commented 6 years ago

Ohai :)

Can Aptly solve the by-hash problem? It's got a decent API, so it's worth a look. The API supports upload too (I've been using its API to directly upload packages from my home Jenkins server, no rsync needed), which could perhaps simplify our current setup.

evgeni commented 5 years ago

I've started actually working on this. The logging config seems to be straightforward, but we need Fastly to enable the S3-based logging for our account (which I requested in a ticket to them).

evgeni commented 5 years ago

Okay, I've got basic logging for stagingdeb working, and would like to discuss a few design things before continuing.

Notes

Current config

cloudfiles:
- name: stagingdeb logging
  access_key: <key>
  bucket_name: fastly
  format: '%h %l %u %t "%r" %>s %b'
  format_version: '2'
  gzip_level: '0'
  message_type: classic
  path: "/stagingdeb/"
  period: '3600'
  placement:
  public_key:
  response_condition: error log
  timestamp_format: "%Y-%m-%dT%H:%M:%S.000"
  user: <user>
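
For anyone who wants to process these logs: the `format` string above is the Apache-style common log format, so lines can be split with a small regex. A minimal sketch, assuming `%t` renders as a bracketed Apache-style timestamp; the sample line is invented and uses a documentation IP address:

```python
import re
from typing import Optional

# Regex for the format '%h %l %u %t "%r" %>s %b'.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # %b is "-" when no body bytes were sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Invented sample line, for illustration only.
sample = '203.0.113.7 - - [01/Jan/2019:00:00:00 +0000] "GET /stagingdeb/dists/ HTTP/1.1" 404 162'
parsed = parse_line(sample)
print(parsed["status"], parsed["request"])
```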

Thoughts and questions

@mmoll @ekohl @ehelms @GregSutcliffe what do you think?

ekohl commented 5 years ago

> Fastly sends logs from multiple endpoints which results in multiple logfiles created for each time period. My initial idea was to have one bucket for all subdomains, but I think having one bucket per subdomain would be easier.

:+1:

> Currently the logs are "rotated" every hour (that's the default), we could increase to 24h, but I don't think this is necessary (we pay for storage and download-traffic only, not per-file).

:+1:

> Currently we only log requests that produced http codes 400 to 600. I don't think we need full access logs?

Stats on how many people use something could be useful. @GregSutcliffe has looked at downloaded plugins (even though it's not fully accurate).

> The logs can be compressed (they aren't at the moment) and encrypted (same). I don't see much value in encryption. Compression I'd like to evaluate for a few days, when we collected a few more logs.

If there are IPs in there we should think about that privacy aspect.

evgeni commented 5 years ago

The logs currently include the client's IP address, yes. We can get rid of it, or we could replace it with one of the geo-based vars Fastly provides. I didn't see any Crypto-PAn or similar in the docs.

(I'd probably just drop them, or replace with 0.0.0.0 so that common parsers still can parse the logs)
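
The effect of the `0.0.0.0` idea on a log line can be sketched as a post-processing step. This is only an illustration of the resulting line shape, assuming the client address is the first space-separated field; the real change would more likely be made in the log format on the Fastly side, and the function name is made up:

```python
def anonymize_line(line: str, placeholder: str = "0.0.0.0") -> str:
    """Replace the first space-separated field (the client address)."""
    host, sep, rest = line.partition(" ")
    if not sep:
        # No space found: not a regular log line, return it unchanged.
        return line
    return placeholder + sep + rest


# Invented sample line using a documentation IP address.
original = '203.0.113.7 - - [01/Jan/2019:00:00:00 +0000] "GET /releases/ HTTP/1.1" 200 512'
print(anonymize_line(original))
```

The field count and order stay the same, which is what keeps common log parsers working.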

evgeni commented 5 years ago

I've configured error logging for both downloads and stagingdeb to log to a separate bucket, but have not implemented any anonymization yet, as I haven't heard any :+1: or :-1: on that, and without anonymization I did not want to store full access logs for now.

zjhuntin commented 5 years ago

I'd vote for :+1: to anonymization, I think 0.0.0.0 sounds like a plan to me.

ekohl commented 5 years ago

For the RSS feed (different vhost) we do count the individual IPs to gather some statistics about installs. @GregSutcliffe might have some more insight into what kind of things are useful. IMHO country code level logging is pretty useful at least to see where the various installs are located. This can be used to plan meetups.

evgeni commented 5 years ago

Okay, next iteration.

We now have four containers/buckets:

The error logs get... the error logs, non-anonymized.
The normal logs get the access logs, with the IP replaced by the country code.
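
With the country code in place of the IP, per-country statistics become a simple count over the first field. A minimal sketch, assuming the country code sits exactly where `%h` used to be; the sample lines are invented:

```python
from collections import Counter
from typing import Iterable


def requests_per_country(lines: Iterable[str]) -> Counter:
    """Count log lines by their first field, assumed to be a country code."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip empty lines
            counts[fields[0]] += 1
    return counts


# Invented sample lines, for illustration only.
sample_lines = [
    'DE - - [01/Jan/2019:00:00:00 +0000] "GET /a HTTP/1.1" 200 10',
    'US - - [01/Jan/2019:00:00:01 +0000] "GET /b HTTP/1.1" 200 20',
    'DE - - [01/Jan/2019:00:00:02 +0000] "GET /c HTTP/1.1" 200 30',
    '',
]
country_counts = requests_per_country(sample_lines)
print(country_counts.most_common())
```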

ehelms commented 5 years ago

What is left to close this out?

evgeni commented 5 years ago

@ehelms I've just opened https://github.com/theforeman/foreman-infra/pull/1030 which adds the current live config (minus secrets). If that's deemed OK and merged, we can go ahead, add new endpoints using that playbook and switch the next services.

evgeni commented 5 years ago

next step is switching yum.theforeman.org: https://github.com/theforeman/foreman-infra/pull/1041

ehelms commented 5 years ago

1041 was merged

evgeni commented 5 years ago

Yepp, but the DNS isn't changed yet as Ohad was out of town.

ehelms commented 5 years ago

Will that close out this issue? If not, what are the next step(s)?

evgeni commented 5 years ago

I think to fully close this one, I'd at least also move the Debian archive behind the CDN. That should account for all our big endpoints.

evgeni commented 5 years ago

DNS updated, I see traffic, all good :)

evgeni commented 5 years ago

So far Fastly has served 2.5M requests, totalling ~230GB of data. There are ~40k 404 errors which I'll look into in more detail later.

ekohl commented 5 years ago

There is still only a ~10% cache hit ratio. The 404s might have an impact on that, but it does mean we save "only" 23GB out of that 230GB.
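
The estimate above as a quick calculation. It assumes cache hits and misses have the same average response size, which the thread does not confirm; a request-based hit ratio only approximates bytes saved:

```python
def cached_gb(total_gb: float, hit_ratio: float) -> float:
    """Estimate data served from cache, assuming equal average sizes."""
    return total_gb * hit_ratio


total = 230.0      # ~230GB served so far
hit_ratio = 0.10   # ~10% cache hit ratio
saved = cached_gb(total, hit_ratio)
from_origin = total - saved
print(f"{saved:.0f} GB from cache, {from_origin:.0f} GB from origin")
# prints "23 GB from cache, 207 GB from origin"
```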

It will be interesting to see what happens after a release of Foreman if there is a higher hit ratio due to a lot of upgrading users.

evgeni commented 5 years ago

Quite a few of the 404s (~20% of the errors I looked at) are against latest/el6 and nightly/el6, which haven't existed for quite a while now.
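
A breakdown like this can be produced by grouping the 404 lines by their leading path segments. A minimal sketch, assuming the common log format where the request sits between double quotes and the status code follows it; the sample lines and function name are invented:

```python
from collections import Counter
from typing import Iterable


def top_404_prefixes(lines: Iterable[str], depth: int = 2) -> Counter:
    """Count 404 responses by the first `depth` segments of the request path."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request found
        request = parts[1].split()   # e.g. ['GET', '/latest/el6/...', 'HTTP/1.1']
        status = parts[2].split()    # e.g. ['404', '162']
        if len(request) < 2 or not status or status[0] != "404":
            continue
        segments = [s for s in request[1].split("/") if s]
        counts["/".join(segments[:depth])] += 1
    return counts


# Invented sample lines using documentation IP addresses.
sample_errors = [
    '203.0.113.7 - - [t] "GET /latest/el6/x86_64/repodata/repomd.xml HTTP/1.1" 404 162',
    '203.0.113.8 - - [t] "GET /nightly/el6/x86_64/repodata/repomd.xml HTTP/1.1" 404 162',
    '203.0.113.9 - - [t] "GET /latest/el6/x86_64/foo.rpm HTTP/1.1" 404 162',
    '203.0.113.9 - - [t] "GET /latest/el7/x86_64/foo.rpm HTTP/1.1" 500 0',
]
prefix_counts = top_404_prefixes(sample_errors)
print(prefix_counts.most_common())
```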

But yeah, the numbers will become interesting when there is a release to be downloaded.

ekohl commented 4 years ago

@evgeni I think we can close this now, right?

evgeni commented 4 years ago

YeS!