slifty opened this issue 4 years ago
@Laurian points out Kong Gateway as something to add to the pile.
Also mentions that we should think about the AWS API Gateway limits.
@slifty and I have been talking about architecture and microservices, and researching options this week. Here's where I think we're at:
We're using AWS already, so we might as well use the more advanced features they offer. Set up AWS API Gateway to handle routing, DNS, and so on; build a simple authorizer function that validates a JSON Web Token (JWT); and make this upload API endpoint a lambda function. Mid-term, we would set up API Gateway to proxy the current Permanent PHP API, and migrate the front-end to use the API Gateway; mid-to-long term, we'd gradually migrate the functionality the PHP API provides into further microservices / lambda functions.
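For concreteness, here's a rough sketch of what that JWT authorizer could look like as a TypeScript Lambda for an API Gateway REST API. This is only an illustration of the pattern, not a worked-out design: the HS256 shared secret, the JWT_SECRET environment variable, and the claim names are all placeholder assumptions.

```typescript
// Sketch of a token-based Lambda authorizer for API Gateway (REST API).
// Assumes a shared-secret (HS256) JWT and a JWT_SECRET environment variable.
import { APIGatewayTokenAuthorizerEvent, APIGatewayAuthorizerResult } from 'aws-lambda';
import * as jwt from 'jsonwebtoken';

export const handler = async (
  event: APIGatewayTokenAuthorizerEvent
): Promise<APIGatewayAuthorizerResult> => {
  // API Gateway passes the Authorization header value as event.authorizationToken.
  const token = (event.authorizationToken || '').replace(/^Bearer\s+/i, '');
  try {
    const claims = jwt.verify(token, process.env.JWT_SECRET!) as jwt.JwtPayload;
    return policy(claims.sub ?? 'user', 'Allow', event.methodArn);
  } catch {
    return policy('anonymous', 'Deny', event.methodArn);
  }
};

// Build the IAM policy document that a REST API authorizer must return.
const policy = (
  principalId: string,
  effect: 'Allow' | 'Deny',
  resource: string
): APIGatewayAuthorizerResult => ({
  principalId,
  policyDocument: {
    Version: '2012-10-17',
    Statement: [{ Action: 'execute-api:Invoke', Effect: effect, Resource: resource }],
  },
});
```

(If we used an HTTP API instead of a REST API, the authorizer could return the simpler `isAuthorized` response shape rather than an IAM policy.)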
We could use Serverless and/or Terraform to manage automated deployments. Developers would run the application locally using the AWS Serverless Application Model (SAM), or we would do more research on running it locally with the Serverless framework.
Pros:
Cons:
Rather than rely on Amazon's infrastructure, we can use open source tools (such as Kong, mentioned above, or perhaps nginx / Apache as a reverse proxy) or build our own API gateway using tools like Express. As in option 1, we would set up our gateway to proxy back to the PHP API as we gradually move functionality out of it; here, our goal would be to have several standalone microservices. We'd still need to figure out authentication, but would probably have more options.
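To make option 2 concrete, a self-built gateway could be as small as an Express app with an authentication middleware and a couple of proxy rules. This is just a sketch under assumptions: the http-proxy-middleware package, the internal service URLs, and a JWT-based check are stand-ins for whatever we actually choose.

```typescript
// Minimal self-hosted gateway sketch: authenticate once, then route requests
// either to new standalone services or back to the existing PHP API.
import express from 'express';
import { createProxyMiddleware } from 'http-proxy-middleware';
import * as jwt from 'jsonwebtoken';

const app = express();

// Reject requests without a valid JWT before they reach any backend.
app.use((req, res, next) => {
  const token = (req.headers.authorization || '').replace(/^Bearer\s+/i, '');
  try {
    jwt.verify(token, process.env.JWT_SECRET!);
    next();
  } catch {
    res.status(401).json({ error: 'invalid or missing token' });
  }
});

// New functionality lives in standalone services...
app.use('/upload', createProxyMiddleware({ target: 'http://upload-service:3000', changeOrigin: true }));
// ...and everything else falls through to the existing PHP API while we migrate.
app.use('/', createProxyMiddleware({ target: 'http://php-api:8080', changeOrigin: true }));

app.listen(8000);
```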
The services we build or adopt would be wrapped up into containers, and then that set of containers would be deployed as a whole. Deployment could be to any of several clouds, as they pretty much all provide container deployment solutions; in theory migrating between cloud providers would be relatively straightforward. Running locally would also be via docker-compose, although the path to running the application locally without containers is probably easier than in option 1.
Pros:
Cons:
Lambdas and microservices are good for reducing inter-team communication requirements for very large companies, and for reaching a massive level of scale; currently, we don't face either problem. Those solutions come with some drawbacks: conceptual difficulty in understanding the application as a whole, coordination between separate components, and challenges in local development & testing. Rather than go down the road to microservices, we could stay closer to the monolithic side of the spectrum.
Build out a new API. Build it in Node + TypeScript, have it proxy back to the current PHP API, and gradually migrate functionality out of the PHP codebase. Having fewer repositories would make it easier for developers to understand the application, and having fewer instances and layers of abstraction would make it easier to debug problems.
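A sketch of how that strangler-style migration could look inside the new API: new endpoints implemented natively in the same codebase, with everything not yet migrated proxied back to PHP. Route paths, the proxy package, and the legacy URL are placeholders, not decisions.

```typescript
// Option 3 sketch: one Node + TypeScript API, no separate gateway or services.
import express from 'express';
import { createProxyMiddleware } from 'http-proxy-middleware';

const app = express();
app.use(express.json());

// New functionality lives directly in this codebase.
app.post('/api/v2/upload', (req, res) => {
  // ...validate the request, record metadata, hand the file off to workers...
  res.status(202).json({ status: 'accepted' });
});

// Anything not yet migrated falls through to the existing PHP API.
app.use(createProxyMiddleware({ target: 'https://legacy-api.example.org', changeOrigin: true }));

app.listen(8000);
```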
Pros:
- npm run start (or whatever we call the script)

Cons:
The shortest path for immediate, urgent goals is to continue to work on the PHP API. We have a sense of the technical debt involved, and we can work on paying that debt down with the intention of keeping this codebase long-term.
Pros:
Cons:
The first three options wrap the current PHP API to continue providing its functionality, and give us a path to gradually migrate that functionality into each option's model of how to build software. We need to avoid breaking the application as we do this re-architecting, and wrapping the existing API seems like the safest way to do that.
They all also require changes to the front end to look for this new API, however; even for option 4, we want to split out the PHP API to its own virtual machine so as to make it easier to maintain both the API and the WordPress instance. In all these approaches, we need to host the API on a separate subdomain.
Hosting the API on a separate subdomain will have an impact on authentication, in addition to whatever constraints the selected option adds, as we'll need to worry about CORS and cookie policies and so on. This is not a concerning amount of work, I think, but it's also not zero work, and needs to be called out as part of this conversation.
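As a rough illustration of the kind of configuration that work involves (assuming an Express API on api.permanent.org and a front-end on www.permanent.org; the origin, cookie name, and attributes below are placeholders):

```typescript
// CORS and cookie settings needed once the API lives on its own subdomain.
import express from 'express';
import cors from 'cors';

const app = express();

// Allow the front-end origin and let the browser send credentials (cookies).
app.use(cors({ origin: 'https://www.permanent.org', credentials: true }));

// Session cookies must be scoped to the parent domain and marked for
// cross-site use, or the browser won't attach them to API requests.
app.get('/login', (_req, res) => {
  res.cookie('session', 'opaque-session-id', {
    domain: '.permanent.org',
    secure: true,
    httpOnly: true,
    sameSite: 'none',
  });
  res.sendStatus(204);
});
```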
IMHO it would be a move in the wrong direction to design in such a way as to tie us more tightly to any specific service provider. I.e., this is not about AWS in particular -- it just comes up most often with AWS because they're the biggest cloud services provider out there and thus, due to economies of scale, they offer the greatest number of distinct technical opportunities for binding one's infrastructure more closely to them. If Digital Ocean offered a DO-specific flavor of lambda, I'd be reluctant to use that too.
It's better to build on the intersection of what all cloud providers provide, as much as possible. I know there is some AWS-specificity in the stack already; I'm just saying let's aim to reduce rather than increase that.
@andrewatwood, @slifty, @xmunoz, and I just had a conversation about this.
None of us like option 1; the biggest concern @xmunoz and I have is the difficulty of local development, or at least the unknowns around it. @kfogel's point is relevant here, as well. So, we won't do that.
None of us like option 4, either; we're ready to move on from that codebase.
Most of the discussion, then, was debating between options 2 and 3. It sounds like the team is quite fond of Docker and wants to use that deployment strategy; while explicitly mentioned in option 2, Docker is also a valid deployment strategy for option 3. We didn't reach consensus on this, and we decided to sleep on it and discuss more on Monday.
Whatever we build, we want to be more intentional about deployments; I took that to mean separate VMs/containers for the API itself and any workers we have (such as for image transcoding), which would in turn be completely separated from the WordPress instance. Docker is a tool that could help us accomplish that.
@xmunoz rightly pointed out that we should be even more incremental than any of these options. @slifty recalled something he and I had discussed: the existing PHP API could call the new uploader service (which might be a microservice, or might be the first method implemented in the new monolith), rather than have the client call the new uploader service. That would allow us to simplify the authentication question, and defer the work of figuring out authentication on the frontend in favor of solving the more immediate problem. We all thought that was a good idea and that we should do that.
We briefly touched on monorepo vs multiple repo, and agreed to defer that question.
We also discussed the long term goal for splitting out the different components: WordPress, the API, and the JavaScript front-end. Ideally, the front-end would be deployed to S3 and served via CDN; the front-end (and our mobile app clients, and our partners) would talk to api.permanent.org, and users would never see that domain; and WordPress would be hosted by itself, preferably not by us, and would pull in the JavaScript bundle from the CDN, similar to how it is now pulling in the bundle from the same webserver. I think we all agreed on that vision, while acknowledging that we weren't going to advance significantly towards it in the near future.
Not sure if this issue is still relevant. @slifty can you close it if not?
It still seems relevant to me, @xmunoz; we haven't made any decisions. This is a long-term planning issue that I expect the incoming director of engineering will want to consider.
Some takeaways and updates -- it's been four (!) months since the main conversation on this issue:
First, some expansions on the conversation above, drawn from my talks with the team.
My opinions on the questions raised above:
@cecilia-donnelly One minor clarification around the upload service -- we didn't defer the question for upload, but rather were leaning towards the idea ("decided") that it would be a good design to have an API gateway sit in front of our microservices and be the outward-facing "service", rather than have each service handle authentication itself. This of course had the added benefit of being a non-committal decision, since it meant we weren't shaving that yak yet regardless.
Ultimately we relied on the PHP API to serve as that gateway -- though from the upload service's perspective it didn't matter how that gateway was implemented, or that the gateway did a lot more than authentication.
Wanted to mention this as an outside observer because it does seem like an API gateway that handles authentication would be a nice pattern to consider (with the microservices then handling authorization themselves, e.g. via JWTs).
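A small sketch of that pattern, assuming the gateway authenticates the user and forwards a signed JWT, and each microservice only verifies the token and checks authorization claims. The claim names, scope string, and shared secret here are made up for illustration.

```typescript
// Microservice side of the pattern: trust the gateway for authentication,
// enforce authorization locally from the claims in the forwarded JWT.
import express from 'express';
import * as jwt from 'jsonwebtoken';

const app = express();

app.post('/upload', (req, res) => {
  const token = (req.headers.authorization || '').replace(/^Bearer\s+/i, '');
  try {
    const claims = jwt.verify(token, process.env.GATEWAY_JWT_SECRET!) as jwt.JwtPayload;
    // Authentication already happened at the gateway; here we only authorize.
    if (!Array.isArray(claims.scopes) || !claims.scopes.includes('upload:write')) {
      return res.status(403).json({ error: 'missing upload:write scope' });
    }
    return res.status(202).json({ status: 'accepted', archiveId: claims.archiveId });
  } catch {
    return res.status(401).json({ error: 'invalid gateway token' });
  }
});

app.listen(3000);
```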
Discussion
What do you want to talk about?
As we begin to incorporate services into the Permanent architecture, we need to make some decisions about authentication and authorization.
The decisions that come from this conversation should also reflect a balance between the short term realities / priorities and the long term architecture vision.
Relevant Resources / Research
Just for starting discussions...