mattermost / mattermost

Mattermost is an open source platform for secure collaboration across the entire software development lifecycle.
https://mattermost.com

[Proposal] GraphQL in Mattermost #18272

Open zefhemel opened 3 years ago

zefhemel commented 3 years ago

This is an early-stage proposal. It does not yet go into much detail on specifics. Its purpose is to gather initial feedback on the approach before investing in it more deeply.

Problem

Loading a channel on Mattermost (page load), e.g. the bugs channel in the webapp results in:

Switching between channels uses an additional 18 API calls.

There is a degree of parallelization of these requests, and not every RESTful call results in initiating a new TCP connection (due to connection reuse). Still, even on a fast (home or office) internet connection it is not uncommon that many seconds are spent on performing these requests.
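
As a back-of-the-envelope illustration of why the request count matters, the sketch below estimates waiting time when requests are issued in parallel batches. The figures are assumptions, not measurements: 150 requests (the number quoted later in this proposal for a first load), 6 parallel connections (a common browser limit per host for HTTP/1.1), and a 100 ms round trip. It ignores server time, payload size and connection setup.

```go
package main

import (
	"fmt"
	"time"
)

// estimateLoadTime gives a rough lower bound on the time spent waiting for
// `requests` API calls when at most `parallel` are in flight at once and
// each call costs one network round trip of `rtt`.
func estimateLoadTime(requests, parallel int, rtt time.Duration) time.Duration {
	if requests <= 0 || parallel <= 0 {
		return 0
	}
	rounds := (requests + parallel - 1) / parallel // ceiling division
	return time.Duration(rounds) * rtt
}

func main() {
	// Assumed numbers: 150 requests, 6 parallel connections, 100 ms round trip.
	fmt.Println(estimateLoadTime(150, 6, 100*time.Millisecond))
	// The same data fetched in a single request costs one round trip.
	fmt.Println(estimateLoadTime(1, 6, 100*time.Millisecond))
}
```

With these assumed numbers the estimate is 25 sequential round trips, i.e. 2.5 s, against a single 100 ms round trip if the same data came back in one response.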

[Figure: request parallelization]

While this may still be acceptable on the desktop, on mobile this becomes much more of a performance bottleneck.

On mobile, network connections...

Analyzing the API calls that our clients make (analysis based on the webapp, but the mobile client largely uses the same APIs), we notice a significant number of calls to REST endpoints:

These are used to augment data from other API calls (usually to fill caches). In addition, likely a lot of the data fetched is not actually used (a problem called overfetching).

Why this matters now

We are hard at work delivering v2 of our mobile client. v2 addresses a lot of (client-side) performance bottlenecks. As we eliminate these performance issues, a new performance bottleneck is emerging: the network overhead. Even when running a mobile client locally communicating with a locally running Mattermost instance, the number of API calls that need to be performed slows the experience down significantly.

While we don’t expect to solve this problem before launching v2, we would like to start working towards a solution already.

Potential solutions

The problem presented is not unique to Mattermost in any way; many complex applications eventually run into it.

There are roughly two solutions:

  1. Introduce additional flags (to control more precisely what data is returned, avoiding both overfetching and additional lookup API calls), or client-specific endpoints that aggregate multiple API calls into one (e.g. a hypothetical “first load” endpoint that would aggregate all data usually fetched in these 150 requests and expose it as a single call). This approach has a few problems:
    1. You end up with a lot of ad-hoc endpoints that need to be maintained.
    2. While webapp and the server are released in tandem, the mobile client is on a separate release cycle and we cannot force synchronous client upgrades. Therefore, backwards compatibility is a concern. Newer mobile clients have to be able to work with older servers and vice versa, making it hard to rely on the existence of new endpoints.
  2. Use a solution like GraphQL which was specifically designed to solve the problem of reducing the number of requests and overfetching.

Proposal

We propose to add a GraphQL endpoint to the Mattermost server. In effect this would mean adding a single additional endpoint (e.g. /api/GraphQL) through which data can be retrieved based on the client’s specific needs (this could be the mobile client, webapp or any other client in the future).
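
To make the idea of a single endpoint more concrete, here is a minimal client-side sketch in Go. It builds one POST request that asks for a channel, its recent posts and the post authors, listing only the fields the client needs. The schema (channel, posts, user and their fields), the host name and the helper names are all hypothetical; only the /api/GraphQL path comes from the example above, and the JSON body with "query" and "variables" follows the common convention for GraphQL over HTTP.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// channelQuery is a hypothetical query; the schema (channel, posts, user and
// their fields) is invented for illustration and does not exist in Mattermost.
const channelQuery = `
query ChannelView($channelId: ID!) {
  channel(id: $channelId) {
    displayName
    posts(first: 30) {
      message
      user { username }
    }
  }
}`

// graphQLRequest is the JSON body commonly used for GraphQL over HTTP.
type graphQLRequest struct {
	Query     string                 `json:"query"`
	Variables map[string]interface{} `json:"variables,omitempty"`
}

// newChannelViewRequest builds one POST request that replaces what would
// otherwise be several REST calls (channel, posts, users).
func newChannelViewRequest(serverURL, channelID string) (*http.Request, error) {
	body, err := json.Marshal(graphQLRequest{
		Query:     channelQuery,
		Variables: map[string]interface{}{"channelId": channelID},
	})
	if err != nil {
		return nil, err
	}
	// The path is the example given in the proposal, not a final decision.
	req, err := http.NewRequest(http.MethodPost, serverURL+"/api/GraphQL", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder host and channel ID; nothing is sent over the network.
	req, err := newChannelViewRequest("https://mattermost.example.com", "channel-id-placeholder")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The sketch only constructs the request; sending it and decoding the response would depend on the schema that is eventually agreed on.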

To limit the scope of an MVP of this project, we suggest introducing GraphQL in phases:

  1. Initially expose the data that would result in the biggest gains in terms of reduction in number of requests and data size for mobile specifically.
  2. Adapt the mobile app to use this new GraphQL endpoint to fetch some of its data when communicating with a server that runs a version that exposes the GraphQL endpoint.
  3. Expose all remaining data currently available via REST via GraphQL as well.
  4. Transition more of the mobile app to fetch data via GraphQL.

And if this is all successful, potentially:

  1. Transitioning the webapp to use GraphQL.
  2. Using GraphQL mutations to e.g. create, edit and delete posts, reactions etc.
  3. Using GraphQL subscriptions for streaming updates.

Challenges & Costs

As with any project, there is an opportunity cost: time spent on building and maintaining GraphQL support could have been spent on other things.

In addition, these are some other things to consider:

Backwards compatibility

While we would not be deprecating the REST APIs, we will have to deal with the fact that the mobile client for some time (likely a year or longer) will have to communicate with Mattermost servers that may not yet have a GraphQL endpoint.

There are multiple ways to deal with this:

  1. An abstraction layer in the mobile client that, based on the server version, sends either GraphQL queries or multiple REST calls.
  2. In the Apollo GraphQL ecosystem there are ways to translate GraphQL queries to REST calls. It may be technically feasible to host this in the mobile app itself, locally translating GraphQL queries to REST calls in the case of older servers. To be investigated.
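
The first option could look roughly like the following sketch, written in Go for consistency with the server even though the mobile client uses a different stack. The minimum version 7.0.0 is a hypothetical placeholder (the proposal does not name a release), the type and function names are invented, and the REST paths are only illustrative of the kind of calls a GraphQL query would replace.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minGraphQLVersion is a hypothetical first server release that exposes a
// GraphQL endpoint; the proposal does not name one.
var minGraphQLVersion = [3]int{7, 0, 0}

// parseVersion turns a plain "major.minor.patch" string into three integers.
// Real server version strings may carry extra parts; this sketch rejects them.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// supportsGraphQL reports whether version is at least minGraphQLVersion.
func supportsGraphQL(version [3]int) bool {
	for i := range version {
		if version[i] != minGraphQLVersion[i] {
			return version[i] > minGraphQLVersion[i]
		}
	}
	return true
}

// ChannelFetcher is the abstraction the rest of the client codes against.
// FetchChannel returns the requests it would issue, without sending them.
type ChannelFetcher interface {
	FetchChannel(channelID string) []string
}

type graphQLFetcher struct{}

func (graphQLFetcher) FetchChannel(id string) []string {
	return []string{"POST /api/GraphQL (one query for channel " + id + ")"}
}

type restFetcher struct{}

func (restFetcher) FetchChannel(id string) []string {
	return []string{
		"GET /api/v4/channels/" + id,
		"GET /api/v4/channels/" + id + "/posts",
		"POST /api/v4/users/ids",
	}
}

// fetcherFor picks GraphQL for new enough servers and falls back to REST
// for older servers or when the version cannot be parsed.
func fetcherFor(serverVersion string) ChannelFetcher {
	v, err := parseVersion(serverVersion)
	if err != nil || !supportsGraphQL(v) {
		return restFetcher{}
	}
	return graphQLFetcher{}
}

func main() {
	for _, version := range []string{"6.3.0", "7.1.0"} {
		fmt.Println(version, fetcherFor(version).FetchChannel("c1"))
	}
}
```

Falling back to REST on an unparseable version keeps the client working against unknown servers, at the price of maintaining both code paths for as long as older servers need to be supported.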

Maintenance of multiple APIs

Each new feature will now have to be exposed via REST as well as GraphQL, increasing development and maintenance effort.

Expected benefits

Performance improvement

TODO: Create an initial GraphQL schema, analyze the requests the mobile client performs on initial load and how those could be translated to one or multiple GraphQL queries.

Technical implementation

Implementation of the GraphQL endpoint could be done in at least 3 ways:

  1. As a Mattermost plug-in
  2. As an Apollo server wrapper around existing APIs
  3. In mattermost-server directly

Plug-in

In this approach we’d develop GraphQL as a plug-in, likely using a library like GraphQL-Go to expose a GraphQL endpoint and mapping it to internal API calls.

Pros:

  1. No changes required to Mattermost server
  2. Can be iterated on separately from the main server
  3. No “commitment” to supporting this long term in case it doesn’t work out.

Cons:

  1. Eventually we will want to bundle this with mattermost-server and migrating it into the main server may be a lot of effort?
  2. Migrating this into mattermost-server over time will likely also mean we have to introduce a breaking change to the GraphQL endpoint path
  3. Dealing with more permutations of plug-in versions and Mattermost versions (e.g. if you add a new REST endpoint to the server, you need to also add it to the GraphQL plugin and somehow make sure both are running the same version for it to work or deal with forward/backwards compatibility)

Apollo server wrapper

The approach would be to deploy Apollo Server alongside Mattermost. Apollo Server is a Node.js-based product. We would expose the GraphQL endpoint separately (perhaps mapped into our regular API namespace via nginx proxying) and have the GraphQL resolvers delegate calls to our existing RESTful APIs.

Pros:

  1. Development separate from main product
  2. Potentially (to be investigated) the GraphQL to REST mapping can be reused in the mobile app to deal with legacy servers that do not already support GraphQL

Cons:

  1. Complicates deployment. Previously, a Mattermost deployment consisted of a single Go-compiled binary; now we’d either require a Node.js install to be present or build an additional Node.js-bundled binary with Apollo and our GraphQL code.
  2. Adds Node.js to our back-end stack, whereas before we standardized on everything in Go.

In mattermost-server directly

In this approach we’d add GraphQL directly into our mattermost-server API, likely using a library like GraphQL-Go to expose a GraphQL endpoint and mapping it to internal API calls.
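
The phrase "mapping it to internal API calls" can be illustrated with a small resolver table. This is a sketch under stated assumptions, not how GraphQL-Go or mattermost-server actually work: it uses no GraphQL library, takes the requested fields as a plain list instead of parsing a query, and the internalAPI interface, its method names and the field names are all hypothetical.

```go
package main

import "fmt"

// internalAPI stands in for the server's existing application layer.
// The method names are hypothetical.
type internalAPI interface {
	GetChannelName(channelID string) (string, error)
	GetPostMessages(channelID string, limit int) ([]string, error)
}

// resolver computes one field of the result for a given channel.
type resolver func(api internalAPI, channelID string) (interface{}, error)

// channelResolvers maps field names a client may request to internal calls.
var channelResolvers = map[string]resolver{
	"displayName": func(api internalAPI, id string) (interface{}, error) {
		return api.GetChannelName(id)
	},
	"posts": func(api internalAPI, id string) (interface{}, error) {
		return api.GetPostMessages(id, 30)
	},
}

// resolveChannel runs only the resolvers for the requested fields, so data
// the client did not ask for is never loaded.
func resolveChannel(api internalAPI, channelID string, fields []string) (map[string]interface{}, error) {
	result := make(map[string]interface{}, len(fields))
	for _, field := range fields {
		resolve, ok := channelResolvers[field]
		if !ok {
			return nil, fmt.Errorf("unknown field %q", field)
		}
		value, err := resolve(api, channelID)
		if err != nil {
			return nil, err
		}
		result[field] = value
	}
	return result, nil
}

// fakeAPI is an in-memory stand-in used to run the sketch; it counts how
// often posts are looked up.
type fakeAPI struct{ postLookups int }

func (f *fakeAPI) GetChannelName(string) (string, error) { return "Bugs", nil }

func (f *fakeAPI) GetPostMessages(_ string, _ int) ([]string, error) {
	f.postLookups++
	return []string{"first post"}, nil
}

func main() {
	api := &fakeAPI{}
	out, err := resolveChannel(api, "c1", []string{"displayName"})
	fmt.Println(out, err, api.postLookups)
}
```

Because only the requested fields are resolved, a client asking for the channel name alone never triggers the post lookup, which is the server-side counterpart of avoiding overfetching.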

Pros:

  1. If our GraphQL plan succeeds, this is likely our ultimate target state, so we’d get there straight away.
  2. As we add new features to our RESTful API, we can immediately expose them via GraphQL in one commit/PR.

Cons: ?

Discussion

We have a dedicated ~GraphQL Discussion channel on the Mattermost community server.

michelengelen commented 3 years ago

I am all-in on this proposal, as I have worked with massive API structures before (for those living in or near Germany: the biggest one might be OTTO) which have been migrated to a GraphQL structure. I have seen projects with single and multiple GraphQL endpoints, but as I am not the strongest in back-end technologies, and especially performance considerations, the actual benefits and drawbacks of each are pretty unknown to me.

I'd like to add one item to consider as well: not only will the development costs for the server part increase, but also (at least initially) those for front-end development, since GraphQL is quite different from what most are used to.

It goes without saying that, if done right, performance can be vastly improved by using Apollo (it has its own state management, and the general opinion is that it can easily replace Redux as well, though that might be out of scope) :P

So, I would gladly volunteer to be an active member of this discussion.

enahum commented 3 years ago

I agree with the proposal and with what @michelengelen mentioned about the front end. That said, there are many front-end alternatives besides Apollo, and each client can decide which option is best for it if this moves forward. Some will use Apollo with its caching and state management; others will use a library only to perform the GraphQL requests, without its caching or state management (keeping the current state management, be it Redux, MobX, a local database or whatever).

Yes, this will translate into a ton of work on both sides, back end and front end, but it is my belief that we will gain a lot from it too. At the very least we should try to build a POC with very limited capabilities, compare it with what we currently have, and then decide whether it is worth the effort. Other products like Boards and Playbooks can also benefit from this, especially because they are at an "early" stage.

streamer45 commented 3 years ago

Let's start by saying that I really appreciate this proposal and also think that having it here on GitHub is a great idea.

I am not going to focus on technical details now but instead I'd like to offer a couple of points that I think we should take into account before making a decision:

Overall, a big +1 on the proof-of-concept, fail-fast approach, which is going to be key, as I know such an endeavour could so easily become a time sink.

zefhemel commented 3 years ago

As mentioned by @aaronrothschild here, one thing we should look into is the permission model and how that's exposed in GraphQL.

almereyda commented 11 months ago

@amyblais Could you leave us subscribers a comment explaining in which way this was closed?

Questions I'm asking myself right now: