MDN migration - Githubissues

bookshelfdave commented 7 years ago

This is a meta-issue covering the MDN migration to AWS. I'm going to leave this issue open until further notice.

Analysis and recommendation

Remaining tasks:

Alias migration
MySQL backups
EFS Replication and Backup
- we're going single region for our first iteration, so EFS replication to other regions is lower in priority.
Sample migration
mdn.mozillademos.org
Elasticsearch hosting

escattone commented 7 years ago

Here are some thoughts on tasks from me:

How will we deploy updates to Kuma, KumaScript, MySQL, etc?
Create an AWS ElastiCache instance in Redis mode for use as the Celery message broker
Create an AWS ElastiCache instance in Redis or Memcached mode for use as an in-memory cache
S3 or Persistent Volume (PV) for attachments, downloads, legacy, and legacy samples?
Use Helm chart for deploying/managing MySQL but with the MDN-custom MySQL image that uses the custom collation. We can use this to deploy two MySQL instances (primary and replica). We need to create a PV/StorageClass which can be claimed (PVC) for this.
- How big should the PVC be for each MySQL node? We should consider the need to support multiple backups (25GB per backup).
Create a LB service that fronts the MySQL instances?
Create a Kubernetes Jinja2 template for Kuma (service, deployment, and persistent volume claim) -- this can be used to deploy Kuma, the Kuma API, and the Celery workers
Create a Kubernetes Jinja2 template for KumaScript (both service and deployment -- no need for persistent storage, as all storage can remain ephemeral in the KumaScript container)
Add redirects to Django in Kuma
Add headers to specific Django requests in Kuma

bookshelfdave commented 7 years ago

ping @escattone @jgmize @jwhitlock

How will we deploy updates to Kuma, KumaScript, MySQL, etc?

We can specify a specific version of the image in the K8s deployment/helm chart for MySQL.
We'll have to investigate K8s image pull strategies for Kuma/KumaScript.
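
As a rough illustration of pinning an image version, a Deployment fragment might look like the sketch below. The names, image, tag, and port are placeholders rather than actual MDN values; `imagePullPolicy` is the Kubernetes field that controls when a node pulls the image.

```yaml
# Sketch only: names, image, tag, and port below are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuma-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: kuma-web
  template:
    metadata:
      labels:
        app: kuma-web
    spec:
      containers:
        - name: kuma
          # Pinning an explicit tag means a rollout happens only when the tag changes.
          image: example/kuma:abc1234
          # IfNotPresent reuses a cached image; Always checks the registry on every pod start.
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8000
```

With a unique tag per build, `IfNotPresent` is usually enough; `Always` matters mainly when a mutable tag such as `latest` is reused.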

Create an AWS ElastiCache instance in Redis mode for use as the Celery message broker

We have some Terraform to do this for us already for Snippets.

Create an AWS ElastiCache instance in Redis or Memcached mode for use as an in-memory cache

The same Terraform above can be used to create either Redis or Memcached clusters.

S3 for attachments, downloads, legacy, and legacy samples

We'll use EFS as it's "just a filesystem" and requires no modifications to the existing Kuma source.
- Here's our recommendation.
- https://github.com/mozmeao/infra/issues/183#issuecomment-296221977
  - this issue has a lot of EFS information, such as how to set up a PV/PVC in k8s.
- Kuma/Django will serve all static assets.
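
For illustration, since EFS is exposed over NFS, a statically provisioned PV/PVC pair could be sketched roughly as follows. The filesystem ID, resource names, and sizes are hypothetical and not taken from the linked issue.

```yaml
# Sketch only: the EFS filesystem ID, names, and sizes are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mdn-shared-efs
spec:
  capacity:
    storage: 100Gi   # required by the API; EFS itself grows elastically
  accessModes:
    - ReadWriteMany  # many pods can mount the same filesystem
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: fs-12345678.efs.us-east-1.amazonaws.com
    path: /
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mdn-shared-efs
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""        # bind to the pre-created PV, no dynamic provisioning
  volumeName: mdn-shared-efs
  resources:
    requests:
      storage: 100Gi
```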

S3 or Persistent Volume (PV) for static assets (CSS, JS, fonts, etc.)?

If we use S3 for static assets we will have to build that into the static pipeline, whereas a PVC requires no extra work on the Django side.

See answer above.

Use Helm chart for deploying/managing MySQL but with the MDN-custom MySQL image that uses the custom collation. We can use this to deploy two MySQL instances (primary and replica). We need to create a PV/StorageClass which can be claimed (PVC) for this.

I can't remember if we've already modified the MySQL helm chart, but we can fork and allow a custom image name to be used (and submit those changes upstream).

I think the PV/PVC should be EBS instead of EFS in this case.

How big should the PVC be for each MySQL node? We should consider the need to support multiple backups (25GB per backup).

gp2 EBS (SSD) can be pretty expensive, so we should try to use cheaper long-term storage options if we can. Maybe this is where we use Kubernetes StorageClasses. Additionally, database backups can easily be stored on S3.

See also:

Create a LB service that fronts the MySQL instances?

~We may be able to use a Route53 Traffic Policy with Failover Rules. I think MySQL backup/restore/clustering requires its own GH issue.~

I'm not sure how to do this yet in the context of Kubernetes.
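
One possible starting point, offered as a sketch rather than a decision, is an internal ClusterIP Service that selects the MySQL pods by label. The name and labels here are hypothetical and would need to match whatever the chart or deployment actually sets; a Service alone does not handle primary/replica failover.

```yaml
# Sketch only: the name and labels are hypothetical and must match the MySQL pods.
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  type: ClusterIP        # reachable only from inside the cluster
  selector:
    app: mysql
    role: primary        # route writes to the primary only
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
```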

Create a Kubernetes Jinja2 template for Kuma (service, deployment, and persistent volume claim) -- this can be used to deploy Kuma, the Kuma API, and the Celery workers

Here's a link to the Jinja2 code for mdn-dev.

Create a Kubernetes Jinja2 template for KumaScript (both service and deployment -- no need for persistent storage, as all storage can remain ephemeral in the KumaScript container)

Add redirects to Django in Kuma

These have already been implemented in the following PRs: https://github.com/mozilla/kuma/pull/4231 https://github.com/mozilla/kuma/pull/4220

Add headers to specific Django requests in Kuma

Anything httpd related is covered in these issues:

[Analysis of current SCL3 Apache config, and possible conversion to Python](https://github.com/mozmeao/infra/issues/180)
convert MDN robots.txt to a Django view [tracking]
Analysis of MDN httpd icon configuration
Analysis of accepted locales in Apache vs Django
plan for migrating mdn.mozillademos.org

escattone commented 7 years ago

Thanks for your comments @metadave! Regarding the MySQL helm chart, the latest version (which was released a short while back) accepts an imageTag (which I tried a while back, and it worked great).

bookshelfdave commented 7 years ago

@escattone I saw the imageTag as well, but I was thinking that we might actually need to use a different image name as well.

escattone commented 7 years ago

@metadave Sorry, I wrote one thing but I meant another! I meant to say that as of version 0.2.6 of the MySQL Helm chart in the stable repo, it supports image as well as imageTag (https://github.com/kubernetes/charts/blob/master/stable/mysql/templates/deployment.yaml#L33). I deployed the current dev MySQL (manageable-puffin-mysql in the mdn-dev namespace) using the image tag a while back.
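
As a sketch of what overriding both values might look like, here is a small values file for the chart. The image name and tag are placeholders, not the actual MDN-custom image.

```yaml
# values.yaml sketch for the stable/mysql chart; image name and tag are hypothetical.
image: example/mdn-mysql
imageTag: "5.6"
```

It could then be passed to the chart with something like `helm install stable/mysql -f values.yaml`.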

escattone commented 7 years ago

The MySQL Helm chart looks like it's set up to request storage using the name of a StorageClass. As @metadave said, we should use EBS instead of EFS for MySQL's storage needs, so I'm wondering if we can use K8s dynamic provisioning in this case?

I'm going to experiment with the StorageClass below. From my current understanding of the K8s docs, when a PVC is created that specifies the name of the StorageClass, K8s will automatically create a PersistentVolume (using the built-in provisioner kubernetes.io/aws-ebs?) configured to use AWS EBS as specified: io1 type, with an IOPS/GB ratio of 10, which is the top of the 3:1 to 10:1 range that AWS recommends for MySQL:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  zones: us-east-1d, us-east-1c
  iopsPerGB: "10"

I'm not sure about the zones.
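
For illustration, a claim against that class might look like the sketch below. The claim name and size are hypothetical; with `iopsPerGB: "10"`, a 100Gi request would be provisioned with roughly 1000 IOPS.

```yaml
# Sketch only: the claim name and size are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  storageClassName: fast   # the StorageClass defined above
  accessModes:
    - ReadWriteOnce        # an EBS volume attaches to a single node at a time
  resources:
    requests:
      storage: 100Gi
```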

bookshelfdave commented 7 years ago

The Virginia cluster is multi-AZ, so zones could be us-east-1b, us-east-1c, us-east-1d

bookshelfdave commented 7 years ago

@escattone I imagine we'll be discussing MySQL quite a bit, so I've created a card to track it.

bookshelfdave commented 7 years ago

I'm going to close this issue, as I feel the cards in the backlog and queued columns represent the work that needs to be completed.

mozmeao / infra

MDN migration #253
