littermap / littermap-aws-backend

Cloud-native backend for the Litter Map application (join us on Discord: https://discord.gg/JvEQMSQaYr)
https://littermap.com
GNU Affero General Public License v3.0

Feedback #1

Closed: FragLegs closed this issue 2 years ago

FragLegs commented 2 years ago

Hi! Daniel Toben reached out to me to get some feedback on the code behind the littermap website. It's hard to know exactly what to suggest without a deeper understanding of what the eventual goals are, but there are a few things that I can point out that might be helpful. Feel free to ignore any of it though - it's just one guy's opinions. :)

First the good stuff:

I think the biggest challenge you are going to face moving forward is the transition from building a thing in isolation to changing a thing that is actively being used. You'll have to start worrying about things like backwards compatibility, schema migrations, and not introducing bugs into production. Most of my feedback here is centered around that transition and how to make life easier for yourself as you go through it.

  1. The README should probably spend more time orienting devs to the code and the general practices, and link out to a docs folder for more information. Imagining I was a new dev working on this, the things I would want to see in the README would be things like: how do I run this locally? how do I test the code? how do the pieces fit together logically? Later on I would want to know how to deploy it, and might come back to the README for that, but wouldn't be put off by having to follow a link to docs/deploy.md.
  2. Speaking of which, it should be easy to run the site locally. Perhaps it already is; if so, see comment #1 and make that info really easy to find. It looks like `sam local start-api` makes it easy to get the API up and running. It would be nice to have a single command that fired up the API, started Postgres and Dynamo, and started a node server for the front end. Ideally a developer should not have to install a bunch of dependencies to do this (maybe use Docker Compose). Check out LocalStack for a way to easily run AWS services locally.
  3. I think you might find some of #2 easier if your frontend code was in the same repo as your backend code. Think of how you would go about running a new branch of the frontend code against a new branch of the backend code to see if some new feature works.
  4. I couldn't find any tests in the code. Maybe they are there and I missed them - if so that info should be moved up into the README too. You'll want unit, integration, and end-to-end tests to ensure that nothing breaks as you add new code. And you'll want those tests to be automated. I like to use the GitHub flow model for development. Generally I lock down the main branch so you can't push directly to it - you can only merge via pull request. And then I set up GitHub Actions so that all of my tests have to pass before I can merge code.
  5. Deployment should probably be easier. Once tests are created and coverage is good, you could consider automating the deployment when code merges to main as long as tests pass. AWS CodePipeline makes this pretty easy, especially if the rest of your infrastructure is in AWS. What you want to avoid is having to do a multi-step deployment process by hand when you've just fixed some bug that showed up in production.
  6. It would be really helpful to have some documentation/diagrams that show how all of these Lambda functions fit together with the Dynamo and RDS backends. Take each of your user stories and show how the data flows through these functions and databases to achieve the desired outcome. I'm a big fan of LucidChart for drawing up those kinds of diagrams.
  7. What are your plans for handling schema migration? Dynamo is theoretically schemaless, but that just means you need to handle it in the client code. Postgres definitely has schemas. There will need to be a way for those things to change without having to lose all of the data you've collected in the databases.
  8. You'll probably want some kind of database backups. AWS makes this easy. Maybe you've already taken care of it.
  9. Do you have a way to version your Lambda functions? How do you tell your app which version to use? If you are currently relying on `latest`, consider using a specific tag tied to the date or the commit hash (or both) so that it is easy to roll back broken changes that make it to production.
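To make point 2 concrete: a Docker Compose file roughly like the sketch below could bring up Postgres and a LocalStack DynamoDB stand-in with a single `docker compose up`. The service names, credentials, and ports here are placeholder assumptions, not the project's actual configuration:

```yaml
# docker-compose.yml (sketch; names, credentials and ports are assumptions)
version: "3.8"
services:
  postgres:
    image: postgres:14
    environment:
      POSTGRES_USER: littermap       # placeholder credentials for local dev only
      POSTGRES_PASSWORD: littermap
      POSTGRES_DB: littermap
    ports:
      - "5432:5432"
  localstack:
    image: localstack/localstack
    environment:
      SERVICES: dynamodb             # only emulate the services the app needs
    ports:
      - "4566:4566"                  # LocalStack's single edge port
```

With this running, `sam local start-api` could be pointed at `localhost:5432` and `localhost:4566` instead of the real AWS endpoints.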
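As a tiny illustration of point 4, a Lambda-style handler can be unit tested with the standard library alone, without any AWS infrastructure. The handler below is hypothetical and only shows the shape of such a test, not the project's real code:

```python
import json
import unittest

# Hypothetical handler, shaped like an AWS Lambda handler behind API Gateway.
# The real project's handlers will differ; this only illustrates the pattern.
def get_location(event, context):
    location_id = event.get("pathParameters", {}).get("id")
    if location_id is None:
        return {"statusCode": 400, "body": json.dumps({"error": "missing id"})}
    return {"statusCode": 200, "body": json.dumps({"id": location_id})}

class TestGetLocation(unittest.TestCase):
    def test_missing_id_returns_400(self):
        response = get_location({"pathParameters": {}}, None)
        self.assertEqual(response["statusCode"], 400)

    def test_valid_id_returns_200(self):
        response = get_location({"pathParameters": {"id": "42"}}, None)
        self.assertEqual(response["statusCode"], 200)
        self.assertEqual(json.loads(response["body"])["id"], "42")
```

Run with `python -m unittest`; integration and end-to-end layers would then exercise the deployed (or `sam local`) API on top of this.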

Once you've gotten a good handle on change management, you can start thinking about scaling-related things. But you've got time before you need to worry about that and it's probably best to wait and see whether traffic picks up.

specious commented 2 years ago

I'd like to express my genuine appreciation to you for taking a look at this code and sharing your expertise. All of the points you've made are coming from a place of experience and are adding extremely valuable input to this project, particularly in the context of launching a successful online platform and having a great product to work on.

Your comments about automating everything are on point. Ditto for reliable test coverage and quality internal tooling and materials (clear documentation, testing environments, best-practices guidelines). All of these areas still need further work before they can be called excellent.

I didn't even realize that GitHub flow was officially a thing. It's a solid introduction to an effective collaboration workflow, and as more developers join the team and the product moves closer to being deployed in production, having an intentionally chosen, clear-cut development flow becomes critical.
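For what it's worth, a minimal GitHub Actions workflow along these lines, combined with branch protection on main, would gate merges on passing tests. The file path, job name, and test command below are assumptions; the project's real test entry point would go in the `run` step:

```yaml
# .github/workflows/ci.yml (sketch; job name and test command are assumptions)
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: make test   # replace with the project's actual test command
```

With the main branch protected and the `test` job marked as a required status check, code can only land via a pull request whose tests pass.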

I'd like to say that I've considered the idea of having the web frontend and the backend in the same repository, for precisely the reason you mention. However, I currently think the advantages of a monorepo setup aren't quite high enough in this case to warrant the trade-offs.

The lambdas are definitely not versioned yet, but this will become important as the project becomes ready for production. Further down the same road, versioning will also open up convenient A/B testing and gradual feature rollout.

Schema migrations will definitely need to be handled.
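One common pattern for this, sketched here with Python's stdlib `sqlite3` as a stand-in for Postgres, is to keep an append-only list of migrations and record the applied version in the database itself, so each statement runs exactly once. The table and column names are invented for illustration:

```python
import sqlite3

# Append-only list of migrations; each entry runs at most once, in order.
# (Illustrative only: sqlite3 stands in for Postgres, and the schema is made up.)
MIGRATIONS = [
    "CREATE TABLE locations (id INTEGER PRIMARY KEY, lat REAL, lon REAL)",
    "ALTER TABLE locations ADD COLUMN description TEXT",
]

def migrate(conn):
    # Track which migrations have already been applied in the database itself.
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] if row[0] is not None else 0
    # Apply only the migrations newer than the recorded version.
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()
    return current, len(MIGRATIONS)

conn = sqlite3.connect(":memory:")
migrate(conn)   # applies both migrations
migrate(conn)   # no-op: already at the latest version
```

The same idea is what off-the-shelf tools (Flyway, Alembic, node-pg-migrate and friends) implement more robustly, so adopting one of those is probably the pragmatic choice.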

Being able to set up an entirely local testing environment would make it quicker to iterate during development.

Another top concern is security: scrutinizing the code and deliberately testing attack scenarios, on top of generally good practices. There is no substitute for doing one's homework here and being prepared for unexpected surprises. It is also no fun sending out an email to all your users saying you didn't have things secured, their personal information has been compromised, and they should change their passwords. There's really no limit on how seriously security ought to be taken.

Other things this project is definitely still missing are metrics and other administrative tools.

And you're completely right about scaling. The idea is to architect it in a way that can scale when the time comes.

And thanks for suggesting specific tools.

specious commented 2 years ago

My personal opinion is that having the basics covered really well goes a long way toward making the development of the more advanced features straightforward.

Having good in-house materials, processes, culture and knowledge is a key asset from which everything else grows naturally.

specious commented 2 years ago

I think it would be fair to say that at the current stage, this project would benefit greatly from perceptive developers with the right combination of aptitude and experience, who could develop it to the point where less experienced developers can more easily orient themselves.

specious commented 2 years ago

I just created an open-ended discussion topic about best practices: #3