WelcometoMyGarden / welcometomygarden

Web app of Welcome To My Garden, a not-for-profit network of citizens offering free camping spots in their gardens to slow travellers.
https://welcometomygarden.org
GNU Affero General Public License v3.0

Moving away from Firebase in favor of a more traditional API #106

Closed archived-m closed 3 years ago

archived-m commented 3 years ago

Hey everyone!

As a result of our recent community call, recent frustration with the deployment pipeline and limitations in both development and welcoming open source contributions, we've often discussed moving away from Firebase for this project. I'd like to have the remainder of this discussion in the open - both to welcome outside suggestions and so we can link back to this should we need to in the future.

I've listed some of the pros and cons to using Firebase below, compiled from my experience using it on this project and a handful of others. I also added a list of repercussions for us should we move, and things we should pay attention to if we were to create a more traditional API. Finally, I've included my suggestions for a new stack and the rationale behind it - but I welcome your suggestions as well.

Originally we decided to use Firebase because we were on a clock and it lent itself well to the issues we were trying to solve. We had no idea what kind of attention the project would get, so we needed something that would scale, in the absence of the skills and time required to set up a proper infrastructure. Many of these reasons have since changed, which sparked the discussion.

Note that I am very much in favour of this change, and while I tried to be objective, this list is coloured through my glasses of dread :). Google is a data hog and I don't trust them with our users' data, let alone mine. If I can avoid spending another dime of user-donated money on helping them continue their business, I will.

Pros to Firebase

Cons to Firebase

What moving looks like for us

These are purposefully ambiguous, there are pros and cons to both sticking with and moving away from Firebase:

The later we choose to make this switch, the harder and more time-intensive it becomes. The facts won't change, and as it stands, neither will the number of contributors. As such, I'd prefer to do it ASAP or not at all.

The tech we choose may warrant a discussion on its own, and I know many have strong feelings and preferences on the direction to go in. Note that which technology any of us prefers or wants has no bearing on whether or not we should switch. That's a separate issue. I will leave my suggestion and rationale below!

Thanks for reading, I look forward to seeing your proposals/thoughts :)

archived-m commented 3 years ago

As for my suggestion on what to replace it with:

I'd love to say something like Elixir + Phoenix or Absinthe, but the whole point of this discussion is to be more welcoming to new contributors :)

Using Stack Overflow's annual developer survey of this year and past years, we see TypeScript is the second most loved language (after Rust) and JavaScript is the most popular technology by a landslide. It also scores consistently well in the StateOfJS survey. Given that our frontend is JavaScript + soon to be TypeScript, and most of our core contributor team is familiar with it, it seems like an obvious choice as a successor.

More specifically, I'd like to use

Edit: All of these technologies are open source

Be sure to let me know your thoughts and/or objections before we make a final decision on this. Thanks!

chaixdev commented 3 years ago

hello, I'm a very new face here, take whatever you want from these comments below ;)

archived-m commented 3 years ago

@chaixdev thanks for weighing in!

I agree on all your points. As a note on authentication, we initially rolled our own email/password auth because we didn't want to force anyone into having a third party account, as you mentioned. I don't think we're at all opposed to having them available as alternative methods of authentication, it's just that when we launched, we never took the time to do both well. While we can roll out third-party auth, we still need to facilitate email/password auth for existing users.

If we're going with the proposed stack, I was planning to use Passport which is a go-to for a lot of Node APIs, is the recommended way to do it using Nest, and has most auth strategies available for you, baked in as separately installable libraries. The tricky bit is that you want your auth server separated from your resource server so you can use and secure refresh tokens, and it becomes pretty microservicey (likely not a problem if we go the docker approach though). If you have experience with Keycloak, I'd love to have a chat.

mariha commented 3 years ago

Hey all,

Very well thought out, @MichielLeyman! Thanks for the in-depth explanation.

I fully support the idea to move away from Firebase.

First of all, I should be able to dedicate a few hours per week for WTMG, probably not more than 10h though. Below I share some of my comments.


If I were to do it by myself, I'd take the Trustroots codebase (fork the repo) and adapt the TR backend to the WTMG frontend (keep the graphical design for sure, not sure about the frontend) and then migrate data and evolve from there.

Some of the arguments for it are:

Otherwise, some thoughts on where to go:


Hosting/deployment

If it made the move easier, Google has Cloud Functions, which could be used as a step between Firebase and a containerized app. Otherwise there are a few options for where to deploy Docker images and how much orchestration/control over infrastructure we'd like to have. I wouldn't go so deep into infrastructure as to use Kubernetes until we need it.

Interesting options (for me) would be:

Both have K8s should we need it at some point.


Architecture

I'd like to advocate for microservices.

Pros: The main advantage is that each microservice runs in a separate process, so each can use its own tech stack, allowing a greater diversity of technologies. It would:

  1. open doors to a wider range of contributors with other skillsets and interests, like me and @Chai with experience in the JVM ecosystem
  2. allow us to use the best tool for the task; messaging and gardens/profiles could be backed by different database engines without increasing the complexity of the features
  3. keep us open to devs skilled in the JVM area: Kotlin is the native language for Android, which may be useful when we decide to have native mobile apps
  4. in the very long run, give us scalability

Cons are that the app is not complicated enough to require it, and it would mean more devops tasks for us.


Tech stack

I have no experience with JS but was planning to learn it, especially frontend. TypeScript is great for me. Backend in JS would be out of my focus for now, as I can usually find my way in the JVM ecosystem; what you propose seems very interesting though.

For database(s) I'd advocate for NoSQL for scalability and ease of use (no need for ORM).


What can I help with?

archived-m commented 3 years ago

Appreciate the feedback! Fully in favour of Docker + K8s (even in the short term imo). I've had fantastic experiences with DigitalOcean, and we are already using both Google Cloud Run and Cloud Functions to host WTMG in its current state. They're all options.

Same for microservices, fully on board. As it stands, I think we'd only need to separate 3-4 services but I'd rather start with that than do it later. As long as we see the back-end/API as being decoupled from our front-ends, a possible native app in Kotlin in the future should be no problem regardless of which architecture we go with, afaik.

As you mentioned, with the microservice approach there's no problem using whichever tool is best for a given feature (e.g. Mongo for geo and Postgres for auth). I will say I'd love for most services to be built using the same paradigms, again to allow for ease of contributions. Not saying they can't diverge, but I'd rather have 3 in TypeScript using NestJS (or 3 using JVM) than one in TypeScript, one in Java and one in Elixir.

Huge fan of the idea to federate and I would love for us to get in touch with Trustroots. I do think this is a separate discussion from using their code as a starting point. As they state on their development page:

Trustroots was built upon MEAN.js boilerplate (from Mongo-Express-Angular-NodeJS). MEAN isn’t active anymore and we’ve modified the codebase extensively for our own purposes, so it’s better not to rely too much on their documentation. While boilerplate was a great way to get started with rather large application, we inherited a lot of cruft and kinda complicated setup from it. As time has passed, several aspects of the application are not that modern anymore and we have lots to do to bring it up to date.

I love reuse, but I don't want to inherit the second-hand cruft they inherited from MEAN. Looks like they have moved most of their codebase to React in the meantime though.

wardbeyens commented 3 years ago

Hello,

Wow, well, I am also in favour of getting rid of firebase and in creating our own backend with new yet approachable technologies.

The technologies listed by Michiel seem very interesting to me. Especially Nest.js and GraphQL I am curious about.

WTMG also serves to learn, in that regard, I am looking forward to trying out (these) newer technologies.

I am especially looking forward to being able to write code with confidence so that nothing else breaks, both in the frontend and backend.

ludov04 commented 3 years ago

Hey all!

Thanks @MichielLeyman for the very clear write up. I also support the idea of moving away from Firebase and agree with most of your points. Firebase is great to get started but it very quickly locks you in. For me the biggest issues are vendor lock-in and that local development is harder.

Tech Stack

As for how to move forward, I'd say Node.js + TypeScript is awesome if welcoming contributors is important to us. I'm a huge fan of statically typed languages in that regard as:

Regarding the backend framework itself I don't have a strong opinion.

Infrastructure / Hosting

I'd also advocate containerizing things as much as we can rather than going for another FaaS solution like Lambdas or Cloud Functions. When we have containerised apps that respect the 12-factor principles, it's very easy to switch between providers or infrastructure layers.
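One of the 12-factor principles is that configuration lives in the environment rather than in the code, which is what lets the same container image run on any provider. A minimal sketch of that idea follows. The variable names (`DATABASE_URL`, `PORT`, `LOG_LEVEL`) and the `AppConfig` shape are assumptions for illustration, not something WTMG defines.

```typescript
// Hypothetical config shape; the names are illustrative, not WTMG's.
interface AppConfig {
  port: number;
  databaseUrl: string;
  logLevel: "debug" | "info" | "warn" | "error";
}

type Env = Record<string, string | undefined>;

const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;

// Build the config from an environment map (a real app would pass process.env).
// Failing fast at startup keeps misconfiguration from surfacing at request time.
function loadConfig(env: Env): AppConfig {
  const databaseUrl = env.DATABASE_URL;
  if (!databaseUrl) {
    throw new Error("DATABASE_URL is required");
  }

  // Default to 3000 when PORT is not set.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`PORT must be a valid port number, got "${env.PORT}"`);
  }

  // Default to "info" when LOG_LEVEL is not set.
  const requested = env.LOG_LEVEL ?? "info";
  const logLevel = LOG_LEVELS.find((level) => level === requested);
  if (!logLevel) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}`);
  }

  return { port, databaseUrl, logLevel };
}
```

Taking the environment as a parameter, instead of reading `process.env` inside the function, keeps the loader easy to test and independent of where the container runs.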

Microservices

I'm in favour but we have to be careful and think about the why. Microservices come with a lot of promises and are great in principle, but they can come with their own set of challenges and are not always worth it. As an example, here is an article about how Istio decided to move away from microservices.

If you think about why companies do microservices, it is often to:

I reckon there are probably good candidates for microservices in our system. Things that come to mind, even tho I'm not familiar with all the features/codebase:

On board with microservices, we just have to be mindful that with every service comes some operational overhead and think things through when we create a new one or decide to split things up. 3-4 seems reasonable for our size tho.

Database

Slight preference for Postgres over MongoDB. I think having a data model enforced by the relational database engine makes it easier to avoid data integrity issues down the road. Relational databases can handle huge amounts of traffic without problems these days, and there are a lot of scaling strategies we can use.

archived-m commented 3 years ago

Thanks everyone! Most people with the time and energy to contribute in the near future have offered their opinions, and it looks like we can consider the decision to move away from Firebase finished. Wonderful :)

As for what we're moving to, still undecided. From what I've gathered on here and through other channels:

Fun! I think these are the base building blocks, and if we can agree on these, we'll figure out more specific tooling and such down the road, they are also more interchangeable in case we find we want to do something else.

I'll create an architecture diagram and set up some very basic different boilerplates to see which are easiest/best/most fun, and we'll go from there?

Go team!

mariha commented 3 years ago

So now I'm gonna take a step back... 😎

I will say I'd love for most services to be built using the same paradigms, again to allow for ease of contributions. Not saying they can't diverge, but I'd rather have 3 in TypeScript using NestJS (or 3 using JVM) than one in TypeScript, one in Java and one in Elixir.

Not sure if I agree... I think there is more reason, in our case, to use separate services because of the flexibility of tech stack it allows (which comes with opportunities to contribute for people from a broader range of backgrounds) than because of the scalability it gives. We just don't need it yet and, as others said, it comes with added complexity. So if not for that flexibility, I'd rather start with a simple, single web service. We will try to keep it well modularized and keep in mind that at some point we may want to divide it into separate services and distribute it over many execution instances. I wouldn't make the move until we are mature enough though... And once we have all the safety net tools and practices in place (good tests, CI/CD, monitoring, ...) we could go distributed. And then, there is no reason why we'd have to stick to a single tech stack for all services 😉 Is there?

Hope it makes sense. Do you agree?

The good thing is that NestJS seems to make it really easy to move from a single web app to a few microservices. And reading some code snippets, it doesn't look that different from what I'm used to, so it may be easier for me to contribute than I initially thought.

Go team! 😉

chaixdev commented 3 years ago

Great to see all the contributions.

containers

Containers, yup; cloud functions, not so much. One of the reasons stated for rolling our own backend is to avoid vendor lock-in (and deservedly so). With containers, you can pick up and move relatively easily, while it seems to me that with cloud functions you're once again stuck with APIs specific to the cloud functions platform provider. I had not yet heard of Knative, looks interesting.

dev <-> ops?

In my view, microservices make developing easier, and operations harder. From the perspective of engaging more contributors, this could actually be a big advantage: keeping developing (and code reviews for merge/pull requests) lighter, while keeping operations in the hands of key contributors.

microservices

I'd like to point out that WTMG is already partly 'micro serviced' in that the tile server is separately hosted. Likewise, I think authentication is a prime target. Moving the user authentication and session stuff out of the core logic would already allow scaling the backend service horizontally, even before separating other parts that @MichielLeyman identified. Then, as the need arises, and we find that certain parts would benefit from being scaled separately, we can work on extracting them.

tech stack

I agree with the preference for a Typed language. I've used Typescript before (in Angular, back when I still pretended to be a full-stack dev :grin: ) and look back fondly (no .js for me brrrr :cold_face:)
I do disagree with @mariha on separate tech stacks though. in my opinion, this would create barriers for contributors that are already active in one area to also contribute in another. I guess what we're weighing is: potentially attracting diverse contributors vs facilitating contributors that are already active and committed. Maybe it comes down to who would be the maintainer of the specific microservice, or perhaps we need a discussion every time we think we want a new microservice for a certain goal.

Certainly, the right tool for the right job applies. If there's already an open-source product that covers our needs, I am much in favour (like Prometheus for monitoring or authentication with Keycloak).

mariha commented 3 years ago

Don't want to seem very picky, but I tend to have opinions and usually express them. I hope you'll get used to it ;)

In my view, microservices make developing easier, and operations harder.

true. Here is a list of things to keep in mind: "You need to be this tall to use [micro] services".

dev <-> ops?

Please, no. Let's keep development as light as possible while keeping the feedback loop closed, and have everyone involved in the feature they develop from beginning to end and over again. It's beneficial for everyone: devs, who can learn from their mistakes and are motivated to prevent them; ops, who would otherwise be downstream of devs, dealing with issues someone else made; and users, who are going to see fewer issues, hopefully 🤞.

moving the user authentication and session stuff out of the core logic would already allow scaling the backend service horizontally

Another option is to keep the session on the client side and use JWT for authentication.
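To make the client-side session idea concrete: the server signs a small set of claims, hands the token to the client, and later verifies the signature instead of looking up a session in a shared store, so any backend instance can validate any request. The sketch below hand-rolls an HS256-style token with Node's built-in `crypto` module purely for illustration. The claim names and helper functions are assumptions, and a real deployment would more likely rely on a maintained JWT library (for example through Passport, as mentioned earlier in the thread).

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claims: who the user is and when the token expires.
interface SessionClaims {
  sub: string; // user id
  exp: number; // expiry, in seconds since the Unix epoch
}

const b64url = (input: string): string =>
  Buffer.from(input, "utf8").toString("base64url");

// HMAC-SHA256 over the data, encoded as base64url.
function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Produce a header.payload.signature token in the usual JWT layout.
function issueToken(claims: SessionClaims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify(claims));
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Return the claims if the signature matches and the token has not expired,
// otherwise null. No session store is consulted.
function verifyToken(
  token: string,
  secret: string,
  nowSeconds: number,
): SessionClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];

  // Compare signatures in constant time to avoid leaking timing information.
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }

  const claims = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8"),
  ) as SessionClaims;
  return claims.exp > nowSeconds ? claims : null;
}
```

The trade-off is that a stateless token cannot be revoked before it expires without reintroducing some server-side state, which is one reason short expiry times and refresh tokens usually come up in the same conversation.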

Then, as the need arises, and we find that certain parts would benefit from being scaled separately, we can work on extracting them.

agree, as the need arises ;) I'd be excited to move it to the next level then.

I do disagree with @mariha on separate tech stacks though. in my opinion, this would create barriers for contributors that are already active in one area to also contribute in another. I guess what we're weighing is: potentially attracting diverse contributors vs facilitating contributors that are already active and committed.

I guess I spent too much time in huuuuge codebases where everything was meant to be uniform and, as a result, making any change was a pain: one would have to do it across the whole codebase (600K LOC), so no one did it at all, for years. Let's write the code so that everyone can easily understand it. So I'd optimize for understandability and expressiveness, whatever technology/library it takes. I don't think we have to agree on it (technical uniformity over diversity) right now though, until we get to the point where we want to separate something.

Let's continue discussing in Slack, if you'd like, it's easier than here.

On everything else we are on the same page 😉

auloin commented 3 years ago

Hi! The discussion is well advanced and you've already covered all major points. Here are my thoughts:

The move

I fully support the move. It hasn't always been so, simply because building a backend takes time (glad to see that all participants are ready to invest some time to make it happen 😄).

To me, Firebase feels just too well engineered to charge users as much as possible.

On the security side, it's to be noted that there is no rate limiting feature, so a single user can ruin it for everyone.
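With our own API, per-user rate limiting becomes something we can add ourselves. As a rough sketch of the idea, here is a fixed-window limiter keyed by user id. The class name, the limits, and the in-memory `Map` are assumptions for illustration; a service running on several instances would need a shared store instead.

```typescript
// Allows at most `limit` requests per key within each window of `windowMs`.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  // The current time is passed in so the behaviour is easy to test.
  allow(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);

    // First request for this key, or the previous window has elapsed.
    if (!current || nowMs - current.start >= this.windowMs) {
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }

    if (current.count < this.limit) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

Because each user has their own counter, one user exhausting their quota does not affect anyone else's requests.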

Because we can't compromise on security, those trying to optimize reads just end up building their API... I think at some point we'll be facing the same issues.

I'd love to see in parallel, the start of a developer guideline. We have the opportunity to document and properly test the whole thing.

The stack

I'd like to see what Elixir is about, or even Kotlin. But for reasons already mentioned, the stack Michiel's proposing will do a better job, imo.

If I have to change something, it would be Typescript for vanilla JS 😸.

Joke aside, I'm fine with anything that can make the job easier.

suancarloj commented 3 years ago

Hey, I thought I would join in the conversation as suggested by @MichielLeyman on slack.

The first thing that I would like to propose is to use something like Terraform from the start. We recently put that in place at my current job and it's really great! It really helps with documenting the infrastructure, and adding new things is relatively easy. You can even write it in TypeScript 😄

When it comes to k8s, I would advise not using it until we cannot do without, unless there are at least 3-4 ops people that can take good care of it. After using it for the past 4 years, I feel that it adds too much complexity. K8s can take a lot of time to debug if something goes wrong with the clusters and you don't know enough about it.

For Docker builds, I would not mind exploring something like kaniko or buildah. I tend to find docker build slow (at least on GitLab).

When it comes to microservices, I have to agree with @mariha: start simple with a single web service, especially with NestJS, as you can build good modules that can later be moved into a new microservice. In the past year, I have read mostly about teams going back to fat services. Microservices with bad domain boundaries will do more harm than good, and we could find ourselves with a distributed monolith 😄. If the choice is to go the microservice way, it would be good to pick a good way to enforce the API contracts. Where I work we currently use gRPC with protobufs. This allows us to make sure that we are well aware of all breaking changes on our APIs, and it also allows us to generate types for the frontends. That said, Protobuf has a lot of drawbacks and can bring a lot of frustration.
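The "good modules first" approach can be pictured without any framework: callers depend on an interface, and the in-process implementation behind it can later be swapped for a client that talks to a separate service. This is a plain TypeScript sketch, not NestJS code. The `Garden`, `GardenCatalog`, and `MessagingService` names are hypothetical and only loosely based on the gardens and messaging features mentioned in this thread.

```typescript
interface Garden {
  id: string;
  name: string;
}

// The contract other modules depend on. It is async on purpose, so that a
// future remote (HTTP or gRPC) implementation would not change any caller.
interface GardenCatalog {
  findById(id: string): Promise<Garden | undefined>;
}

// In-process implementation used while everything runs as one web service.
class InMemoryGardenCatalog implements GardenCatalog {
  private readonly gardens: Map<string, Garden>;

  constructor(gardens: Garden[]) {
    this.gardens = new Map(gardens.map((g): [string, Garden] => [g.id, g]));
  }

  async findById(id: string): Promise<Garden | undefined> {
    return this.gardens.get(id);
  }
}

// A consumer from another module. It only knows the interface, so it does
// not care whether gardens live in the same process or behind a network call.
class MessagingService {
  private readonly catalog: GardenCatalog;

  constructor(catalog: GardenCatalog) {
    this.catalog = catalog;
  }

  async greetingFor(gardenId: string): Promise<string> {
    const garden = await this.catalog.findById(gardenId);
    if (!garden) {
      throw new Error(`Unknown garden: ${gardenId}`);
    }
    return `Hello from ${garden.name}!`;
  }
}
```

If the garden module is ever extracted, only a new `GardenCatalog` implementation is needed. The interface then plays the role of the API contract, which is where a schema such as OpenAPI or protobuf would come in.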

Regarding NestJS, I think it's a good choice, as it's quite flexible. I would just suggest keeping the default setup with the Express platform rather than trying Fastify, which is technically faster but has little documentation. Nest supports the OpenAPI spec, which is nice for keeping the APIs well documented.

Concerning Postgres vs MongoDB, I have a preference for Postgres, as you can have constraints in the db and it's much easier to do analytics with SQL than with MongoDB query pipelines. If MongoDB is preferred, then I will just say: do not use mongoose or typegoose as the ODM for Mongo. They are really bad for performance, and semver is not well respected by mongoose.

In the case of GraphQL vs REST, I have no preference, but I find REST to be much more accessible, and there is so much content available that we are unlikely to encounter any surprises.

Looking forward to starting to work with all of you :)

archived-m commented 3 years ago

Thanks so much for the input everyone!

Here are the key decisions (I think) we've made, which I think we can consider final, allowing us to get started:

Migrating

The following is a list of what I interpret as hard requirements before we can export production data and make the actual move. It serves as a progress checklist for being "done" with this. We're aiming for migration without regression, and anything that adds to the current functionality is considered out of scope for this issue.

I'll leave this list as-is for another 2 days to welcome final input, changes or concerns, and then I will separate it into individual issues, where I will ask for help and on which you can express interest/commitment should you wish to contribute :heart:. I will also expand on each point, and we can discuss their intricacies there, separately. I've created a separate milestone (called v2 - Community) that we can use to track progress, and I will close this issue when the individual issues have been created.

Deployment of the tile server is a separate, non-blocking issue and may not be an issue if we can partner with Mapbox.

I will take full responsibility for front-end related tasks together with @wardbeyens and @auloin, given its open PRs and general lack of contributor guidelines at the moment. If you wish to help, please respond to the individual issues for these tasks (see end of post).

Nice to have

Here are some open issues to keep in mind while developing replacement endpoints. They could be quick fixes, but are not required before migrating to this "new stack":

I will kick us off with a starter skeleton and directory restructure in a draft pull request to develop. Go team!

ludov04 commented 3 years ago

Thanks @MichielLeyman for this awesome write up.

Knative makes it easy to start with Cloud Run and later move to Cloud Run for Anthos or start in your own Kubernetes cluster and migrate to Cloud Run in the future. By using Knative as the underlying platform, you can move your workloads freely across platforms, while significantly reducing the switching costs.

archived-m commented 3 years ago

The milestone (track progress here), v2 branch (all v2 related code and changes go here), and individual issues are ready.

For any additional input, please comment on the corresponding issues.

Thanks for responding everyone, it's been super helpful!