Closed dawnpruitt closed 1 year ago
Lagoon is a container-based application management platform built around a microservices architecture. The microservices approach breaks the platform down into individual containers or groups of containers, each responsible for specific tasks or functions. Each service can then be tested, updated, scaled, and inspected independently, which improves scalability, flexibility, and maintainability.
The Lagoon platform is divided into two major components: Core and Remote.
The Core component is responsible for managing critical services such as the API, authentication, and external communication. A single Core can serve multiple Remote instances (and will, in our case). The recommended practice is to install Core on a separate Kubernetes cluster from the Remote instances. This separation provides better isolation and stability for the core functionality.
The Remote component focuses on services related to provisioning, deployment, and hosting applications (e.g. Drupal). It manages the underlying infrastructure and resources required to run applications in various environments, such as development, staging, and production. By maintaining a clear distinction between Core and Remote components, Lagoon can provide a streamlined and efficient platform for managing containerized applications, ensuring that each component can evolve and scale independently.
Kubernetes is provided to the CMS team and administered by VFS-Platform through AWS Elastic Kubernetes Service (EKS). The AWS network that EKS uses is segmented into four parts: Utility, Dev, Staging, and Prod. Utility can communicate directly with the others, but the Dev, Staging, and Prod segments are isolated from one another for security reasons.
Because of the above security consideration and the best-practice recommendation from Lagoon, our implementation will consist of:
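The segmentation rule can be expressed as a small check. This is a minimal sketch for illustration only: the function name and the segment identifiers are hypothetical, and it assumes that communication with Utility is allowed in both directions, which the description above does not state explicitly.

```python
# Hypothetical model of the four network segments described above.
SEGMENTS = {"utility", "dev", "staging", "prod"}


def can_communicate(source: str, target: str) -> bool:
    """Return True if two segments may talk to each other directly."""
    source, target = source.lower(), target.lower()
    for name in (source, target):
        if name not in SEGMENTS:
            raise ValueError(f"unknown segment: {name!r}")
    # Traffic inside one segment is always allowed.
    if source == target:
        return True
    # Assumption: Utility traffic is allowed in both directions.
    # Dev, Staging, and Prod never reach one another directly.
    return "utility" in (source, target)
```

Under these assumptions, `can_communicate("utility", "prod")` is allowed, while `can_communicate("dev", "staging")` is not.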
Each instance of the CMS hosted on a Remote will connect to AWS services outside the EKS cluster:
This leverages AWS-provided management and redundancy rather than attempting to replicate that same functionality within Lagoon.
Each EKS cluster is connected to the internet through VA's Trusted Internet Connection (TIC), a set of network security and boundary devices that protects the VA internal network. This matters because the TIC heavily restricts the ports and protocols available for bidirectional communication with external dependencies. Additionally, network inspection across the Open Systems Interconnection (OSI) layers adds significant latency and reduces bandwidth.
The end result is that application builds can take an unacceptable amount of time, especially considering that most of the traffic transiting the TIC will not differ between individual builds. We expect to solve this problem using Nexus Repository Manager. Nexus caches packages and other software dependencies between builds, minimizing the performance impact of the TIC.
Pygmy is a Docker-based Drupal Development environment that simplifies local development environments for web applications. Pygmy can be used in conjunction with Lagoon to provide a local development environment that closely matches the production environment in Lagoon.
Pygmy depends on the existence of the files below. Importantly, these files, with the addition of .lagoon.yml, are exactly what a Lagoon Remote will deploy to run an application.
docker-compose.yml
.lagoon/cli.dockerfile
.lagoon/nginx.dockerfile
.lagoon/php.dockerfile
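A repository can be checked for these files before starting Pygmy or pushing to Lagoon. The sketch below is illustrative only: the function name and the `include_lagoon` parameter are hypothetical, and the file names are taken from the list above.

```python
from pathlib import Path

# Files that Pygmy expects, as listed above.
REQUIRED_FILES = (
    "docker-compose.yml",
    ".lagoon/cli.dockerfile",
    ".lagoon/nginx.dockerfile",
    ".lagoon/php.dockerfile",
)
# Additional file that a Lagoon Remote needs.
LAGOON_ONLY_FILES = (".lagoon.yml",)


def missing_files(repo_root, include_lagoon=False):
    """Return the expected files that are absent under repo_root."""
    root = Path(repo_root)
    wanted = REQUIRED_FILES + (LAGOON_ONLY_FILES if include_lagoon else ())
    return [name for name in wanted if not (root / name).is_file()]
```

Calling `missing_files(".", include_lagoon=True)` from the repository root returns an empty list when everything a Lagoon Remote needs is present.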
Pygmy and Lagoon both use Docker Compose, a multi-container definition of an application, to deploy and run an application. This makes Pygmy a powerful tool for verifying an application will run as expected in production; local container images, package versions, services, and configuration exactly match those deployed to production. If it runs locally, it will run on production.
Lagoon will have no impact on CMS Engineers using Windows, Linux, or Mac (whether Intel or ARM), unless we want it to. Using Lagoon does not preclude the use of existing local development tools such as DDEV.
It is actually preferable for most engineers to continue using DDEV for local development rather than Pygmy; outside of Pygmy's effort to enforce parity between local development and the production environment, there is no meaningful benefit to using Pygmy, and it would require a substantial migration process and retraining the team.
The deploy process will be very much like the current Build Release and Deploy pipeline. The major differences are that Jenkins will no longer execute tasks and those tasks will no longer be defined in Ansible.
Instead, Lagoon Core will receive webhook requests, and build and deployment tasks will then be carried out by Lagoon Remotes.
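The hand-off from webhook to Remote can be pictured roughly as follows. This is a conceptual sketch, not Lagoon's API: `DeployTask`, `handle_push_webhook`, and the branch-to-Remote mapping are all hypothetical, and only the `ref` and `after` fields of a GitHub push payload are assumed.

```python
from dataclasses import dataclass

# Hypothetical routing table; real routing is configured in Lagoon itself.
BRANCH_TO_REMOTE = {"main": "staging"}
DEFAULT_REMOTE = "dev"  # assumption: other branches build on the Dev Remote


@dataclass(frozen=True)
class DeployTask:
    remote: str
    branch: str
    commit: str


def handle_push_webhook(payload: dict) -> DeployTask:
    """Turn a simplified push payload into a task for one Remote."""
    ref = payload["ref"]
    prefix = "refs/heads/"
    if not ref.startswith(prefix):
        raise ValueError(f"not a branch push: {ref!r}")
    branch = ref[len(prefix):]
    remote = BRANCH_TO_REMOTE.get(branch, DEFAULT_REMOTE)
    return DeployTask(remote=remote, branch=branch, commit=payload["after"])
```

The point of the sketch is the division of labor: Core only decides where the work goes, and the selected Remote performs the build and deployment.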
The overall deployment process is broken into two parts:
main branch are deployed and tested on the Staging Lagoon Remote.
main.
main in order to load the .lagoon.yml and docker-compose.yml files (Lagoon still needs these in order to fully work).
Staging environment and uses them (instead of building Images or tagging them from upstream).
Disclaimer: There currently is no Lagoon functionality that mimics or replicates our current Production Deployment system.
Generally, all of the Jobs or Tasks handled by Jenkins will need to be replicated in Lagoon. Many of the tasks may be covered by what Lagoon does by default, e.g. Build and Deploy. Of particular note are the runtime tasks listed in the issue linked below.
Each task category will need to be reviewed and a determination made on whether:
All Jenkins jobs that support Build, Deploy, and Runtime CMS tasks.
This section will describe how we access various components of the new architecture.
The Lagoon Dashboard is the web-based interface for managing projects, environments, and deployments on the Lagoon platform. It provides an overview of the current state of our applications, as well as access to logs, metrics, and other essential information. This will likely be used mostly by DevOps engineers, but should be accessible by every member of the team.
The Lagoon CLI allows engineers to access Lagoon environments through a command-line or "shell" interface and run tasks, introspect the running environments, etc. We anticipate that this will use the SOCKS proxy that engineers are trained on and configure as part of their onboarding.
As with the current architecture, a SOCKS proxy can be used as a secure connection to Drupal environments running within Lagoon. We do not anticipate any changes to configuration or documentation.
Accessing Drupal within Lagoon environments from within the VA network, e.g. GFE with a VPN, should similarly remain unchanged.
Other services, e.g. Keycloak and Harbor, are primarily only of interest to DevOps engineers. To the extent that they require any sort of direct human interaction, this will likely be accomplished via Kubernetes administration tools like Lens, EKS management, or other official tools. We do not anticipate a need for training, documentation, or discovery on these aspects of access.
Below is a rough roll-out plan for Lagoon. Ideally, this should start with migrating CMS-Test infrastructure from BRD to Lagoon. This will serve as an excellent testbed for the eventual migration of CMS Prod infrastructure to Lagoon, and the lessons learned during this exercise will contribute profoundly to that effort. Lastly, many of these tasks are captured in the overall Lagoon implementation GitHub Project and listed below.
I had this for the deployment flow:
main.
main branch.
main; that is, the most recent commit that has been successfully built and has passed the full battery of automated tests.
Does that match reality/our understanding more-or-less? I didn't want to change what you had, because I'm not completely confident of my understanding.
I fluffed up some sections that seemed a little terse to me, fixed a couple typos, and fleshed out the sections on Access -- I think I know what you meant here, so I went with it because it seemed like a boring section to write 🙂 If I was completely off the mark then feel free to delet this.
main in order to load the .lagoon.yml and docker-compose.yml files (Lagoon still needs these in order to fully work).
Staging environment and uses them (instead of building Images or tagging them from upstream).
Disclaimer: There currently is no Lagoon functionality that mimics or replicates our current Production Deployment system.
Note: For this deployment pipeline design to work as intended, Lagoon must be extended. The issues listed below must be resolved before this is possible:
Description
Ticket #13111 is the architecture diagram. The architecture documentation will be informed by the architecture diagram and by previous conversations we've had with Amazee.io about how we currently architect our projects around the existing network architecture. Additionally, this will be informed by Lagoon's limitations and verbal agreements to extend Lagoon beyond those limitations.
Team
Please check the team(s) that will do this work.
CMS Team
Public Websites
Facilities
User support