graphile / crystal

🔮 Graphile's Crystal Monorepo; home to Grafast, PostGraphile, pg-introspection, pg-sql2 and much more!
https://graphile.org/

Adding subscriptions #92

Closed calebmer closed 5 years ago

calebmer commented 8 years ago

In #87 I laid out a timeline for PostGraphQL, but I forgot to add subscriptions to that timeline! Once we’re done writing lots of documentation for PostGraphQL (step 2 in #87), let’s implement subscriptions to make PostGraphQL a viable contender for realtime API needs. I want to open this issue now because I’m not entirely sure how this would work, and I want help designing the feature. I know GraphQL JS has support for subscriptions, and PostgreSQL has a LISTEN/NOTIFY command for building simple pubsub systems. I’m just not sure how to wire them up.

This is especially challenging because (as I understand it) LISTEN/NOTIFY is not statically typed, so this means events are:

  1. Not discoverable.
  2. Not easily convertible into a statically typed GraphQL schema.

To make things even more tricky, in the default configuration NOTIFY payloads must be shorter than 8000 bytes.


Here are my preliminary thoughts on implementation, but let’s discuss them!

I’ve talked to people before who’ve implemented subscriptions in their GraphQL API and they said that whenever their server would get an update they would rerun the full GraphQL query. Lee Byron also said Facebook did this in the Reactiflux Q&A. The quote is:

the full subscription query is run every time the underlying pubsub system gets triggered

So we’ll do that. This means the sub-8000-byte NOTIFY payload only needs to carry a Relay ID or something similar.
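Since NOTIFY payloads are size-limited, here is a minimal sketch of the "send an ID, not the data" approach (hypothetical helper, not part of PostGraphQL):

```javascript
// Hypothetical helper: serialise a NOTIFY payload and guard against the
// default 8000-byte limit. The idea is to send only identifiers and have
// the server rerun the full GraphQL query on receipt.
function buildNotifyPayload(data) {
  const payload = JSON.stringify(data);
  // NOTIFY's limit is on bytes, not characters, so measure UTF-8 bytes.
  if (Buffer.byteLength(payload, "utf8") >= 8000) {
    throw new Error("NOTIFY payload too large; send an ID and refetch instead");
  }
  return payload;
}
```

For example, `buildNotifyPayload({ __node__: ["posts", 1] })` stays tiny, whereas embedding a whole row risks exceeding the limit.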

PostGraphQL would LISTEN to the postgraphql channel, where we’d expect information about types and a table primary key from NOTIFYs.

But we still need somewhere to define what kinds of subscriptions PostGraphQL knows about on startup time, so do we define these in a JSON file? In CLI arguments? Is there a way to add metadata to the PostgreSQL database itself? Do we create a postgraphql_subscriptions table and expect users to register subscriptions there? Or would a table like that be completely internal?

Thoughts?

magom001 commented 6 years ago

Hi Benjie!

Have you decided what mechanisms to use? Would it be postgres triggers?

benjie commented 6 years ago

Probably Logical Decoding, but first I need to find out how well supported it is on the hosting environments the users use. If you want to contribute your environment, please fill out this survey:

https://benjie10.typeform.com/to/ALL9df

benjie commented 6 years ago

(I also plan to support LISTEN/NOTIFY which you can fire through triggers; but that will be less customised)

magom001 commented 6 years ago

Hi Benjie! Just started digging into logical decoding. I think LISTEN/NOTIFY would be a good place to start for a basic functionality. I am writing a warehouse management app and I basically need to notify clients to refresh the stock when somebody inserts data into the db. So the algorithm is quite simple: data inserted -> trigger -> notify -> listener triggers refetch on the client side. As far as I know nodejs can already listen to pg notifications (http://bjorngylling.com/2011-04-13/postgres-listen-notify-with-node-js.html). I use react-apollo on my client, so a websocket server must be implemented on the server side?

benjie commented 6 years ago

Yep; listen/notify will be the first thing I implement but it'll effectively just have query on the payload:

subscription {
  listen(channel: "notify_channel") {
    query {
      allFoos { nodes { id title } }
    }
  }
}

(If anyone can think of a better name for listen/channel above, please let me know. Maybe something more like genericSubscribe?)

But we want the CRUD events to also trigger custom events, such as:

subscription {
  stockItemCreated {
    createdStockItem {
      id
      category
    }
  }
}

subscription {
  stockItemDeleted {
    deletedStockItemId
  }
}

These would probably also accept filters so you can be informed only when specific criteria are matched (e.g. you may only care if one of the items in your local store is deleted). We also need to ensure that the items are visible to you; otherwise there's little point telling you that a stock item was created if you're not allowed to see it. This is where the complexity lies.

jmealo commented 6 years ago

@benjie: @chadfurman: Sorry for dropping off this for so long. I finally dusted it off. Since my aspirations are pretty lofty I'd like to bow out on this ticket and offer you my vision for using PostgreSQL as an application server (zero coding, use the database, get instant pub/sub, work queue, rest and graphql with edge-based template binding with the option of static site or redis cache population on database changes as well as incoming and outgoing webhooks processed inside of nginx): https://github.com/JarvusInnovations/lapidus/issues/4#issuecomment-345539467

I'm going to try to break out the logical decoding slot management of Lapidus into a node module or possibly a shell wrapper, that might be useful, I'll ping you for feedback if I go that route (I may expose it as a rest api directly from OpenResty and then do a cli-wrapper around that).

Since all of that sounds pretty... ambitious, I'd like to point out I'm just gluing together fairly robust OSS and hopefully providing cli/rest/gui/documentation/examples as time allows.

benjie commented 6 years ago

Exciting! If you don't already have a PostgreSQL work queue, here's one I've been working on with a simple node worker (idea is you can implement the worker in any backend, or even multiple backends that each work different job types):

https://gist.github.com/benjie/839740697f5a1c46ee8da98a1efac218

Good luck with Topsy!

HariSeldon23 commented 6 years ago

Not sure if this of any interest to anyone as it's outside the scope of Postgraphql, but there is a very interesting project at https://github.com/DxCx/graphql-rxjs whereby DxCx is getting RXJS and GraphQL to work together to try and achieve Live Queries. Pretty cool project

benjie commented 6 years ago

Subscriptions

benjie commented 6 years ago

If you, or your company, would like to see subscriptions sooner rather than later then please consider supporting my work via Patreon:

https://www.patreon.com/benjie

(If your company requires some kind of invoice/contract/concrete value exchange then please get in touch and we'll see what we can figure out!)

benjie commented 6 years ago

🙏 Please support my Patreon: https://www.patreon.com/benjie - huge thanks to those of you already supporting it, and a gigantically humongous thanks to one very generous supporter in particular! 🙏

💼 If any companies want to support this work, please get in touch! There's one company already interested - if anyone else wants to join in that will help share the financial load! 💼


Ignoring the CRUD (Logical Decoding) subscriptions for now, I've come up with this idea for how generic LISTEN/NOTIFY subscriptions might work:

GraphQL

The GraphQL interface will need to support generic notifications, so it can't have specific types. We can obviously make query available on subscriptions, but that's not always that useful. However! We can also make the Node interface available on a subscription, and that is useful. We also need a way of filtering the subscriptions so we're not dealing with the fire hose. I propose the following GraphQL schema (approximation):

input ListenFilter {
  key: String!
  value: String
  inValues: [String!]
}

type ListenPayload {
  query: Query!
  node: Node
}

type Subscription {
  listen(
    topic: String!
    filters: [ListenFilter!]
  ): ListenPayload
}

schema {
  query: Query # standard PostGraphile schema
  mutation: Mutation # standard PostGraphile schema
  subscription: Subscription
}

So you might subscribe to a post being created event with something like:

subscription {
  listen(topic: "postCreated", filters: [{key: "targetBlogId", value: "7"}]) {
    node {
      nodeId
      ... on Post {
        id
        title
        body
        authorByAuthorId {
          id
          name
        }
      }
    }
  }
}

PostgreSQL

So to make that work, we need to trigger the events. To do so we'll use PostgreSQL's NOTIFY command (or the pg_notify function).

Channel name

I propose that the "topic" above is the PostgreSQL NOTIFY channel.

To make sure we're only firing the events that we intend, I propose that we prefix the PostgreSQL NOTIFY channel with postgraphile:.

Filters

The filters would run against the PostgreSQL NOTIFY payload which should be a simple JSON string-string object. (I would have gone with a query string, but these are more awkward to construct in PostgreSQL and harder to digest consistently.)

The keys that you specify on the payload are up to you. When performing the comparisons, we'll stringify each value in Node.js-land so it should be safe to pass 'small' integers/floats, but passing arrays or objects as values would be bad. Officially we only support string-string.

If the payload cannot be parsed as JSON then it will be treated as if it were {}; i.e. any filters would fail.
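A sketch of how these comparison semantics might look in Node (hypothetical helper, ignoring reserved `__*__` keys):

```javascript
// Hypothetical sketch of the proposed filter semantics: parse the NOTIFY
// payload as JSON (falling back to {} on failure), stringify each value
// in Node.js-land, and require every filter to match via `value` or
// `inValues`.
function matchesFilters(rawPayload, filters) {
  let payload;
  try {
    payload = JSON.parse(rawPayload);
  } catch (e) {
    payload = {}; // unparseable payloads behave as {}, so all filters fail
  }
  return filters.every(({ key, value, inValues }) => {
    if (!(key in payload)) return false;
    const actual = String(payload[key]); // stringify before comparing
    if (value != null) return actual === value;
    if (inValues != null) return inValues.includes(actual);
    return true; // key-presence-only filter
  });
}
```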

Filters will only be run against keys that do not begin or end with a double underscore so that we can reserve __*__ keys for internal usage, such as...

Node identification

We also need a way of specifying which node the event concerns. In PostGraphile nodeIDs are quite simple: each is just the base64-encoded JSON array of the table name and the primary key values; for example this nodeID: WyJ0ZWFtX21lbWJlcnMiLCIxYjE3ODY3Yy03ODM5LTQ4OWQtOWY2Ni1jYmJkMWYyYjQyYjUiLCI4ZGZlMjY3Ni0xNTQ4LTRiM2UtOTk1YS1iYjQ3NzUzNzNkZGMiXQ==

is just a base64 encoding of: ["team_members","1b17867c-7839-489d-9f66-cbbd1f2b42b5","8dfe2676-1548-4b3e-995a-bb4775373ddc"]

Which is the table team_members and the two primary key values team_uuid and member_uuid.

Now, in PostgreSQL json_build_array always puts spaces after the commas, so we can't do the base64 encoding neatly in PostgreSQL (we could construct the JSON format ourselves using string concatenation, but that's a bit icky), so we'll just pass the JSON down the wire to Node, which can base64-encode it for us.

We should store this value into a key called __node__.
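The Node-side re-encoding described above could be sketched as follows (hypothetical helpers; PostGraphile's actual internals may differ):

```javascript
// Hypothetical sketch: take the __node__ array from the NOTIFY payload,
// re-serialise it without the spaces json_build_array inserts, and
// base64-encode it to produce the nodeID.
function nodeIdFromPayload(nodeArray) {
  // JSON.stringify emits no spaces, matching the compact nodeID format.
  return Buffer.from(JSON.stringify(nodeArray)).toString("base64");
}

// The reverse direction, e.g. for the visibility check described later.
function decodeNodeId(nodeId) {
  return JSON.parse(Buffer.from(nodeId, "base64").toString("utf8"));
}
```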

Example

Combining all the above you might have, for example:

CREATE FUNCTION app_private.posts__notify__create() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('postgraphile:postCreated',
    json_build_object(
      'authorId', NEW.author_id,
      'blogId', NEW.blog_id,
      'section', NEW.section,
      '__node__', json_build_array('posts', NEW.id)
    )
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql VOLATILE;

CREATE TRIGGER _900_notify_create AFTER INSERT ON posts FOR EACH ROW
  EXECUTE PROCEDURE app_private.posts__notify__create();

Security

Events with no node field

If no __node__ field is specified in the event payload then we will not perform any checks and will simply run the subscription payload using the subscribing user's privileges. Obviously the node field will be null because no __node__ was specified, but there is a small amount of information leakage here in that you know the event itself fired; e.g. if you listened for postCreated authorId=7 and that event didn't include a __node__ on its payload then you'd be able to infer that author 7 created a post even though you can't necessarily see what that post is (you can also infer that an author with id 7 exists).

If this is a concern then you'd just have to make sure that your pg_notify payloads specify __node__ as in the example above.

Events with node specified

When __node__ is specified we can go one step further and check that the user doing the subscribing is allowed to view that node; e.g. we might run an SQL query with their privileges along the lines of select 1 from table_name where primary_key = $1.

By default this is an index-only scan so it should be highly performant; once RLS comes into the picture things may get a little more complex, but I still think this is a fairly good solution.

If no rows are returned by the above query then we will not trigger the GraphQL subscription payload, and thus the user will have no idea anything happened and thus no information is disclosed.

If a row is returned then we can continue to execute the rest of the subscription payload using the user's privileges and send the data down to them.
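Assembling that check might look something like this (hypothetical sketch; the identifier quoting and exact query shape are assumptions, not PostGraphile internals):

```javascript
// Hypothetical sketch: build the parameterised visibility check described
// above from a decoded __node__ array of [tableName, ...primaryKeyValues].
function buildNodeCheck(nodeArray, pkColumns) {
  const [tableName, ...pkValues] = nodeArray;
  const ident = (name) => `"${name.replace(/"/g, '""')}"`; // naive quoting
  const where = pkColumns
    .map((col, i) => `${ident(col)} = $${i + 1}`)
    .join(" and ");
  return {
    text: `select 1 from ${ident(tableName)} where ${where}`,
    values: pkValues,
  };
}
```

Running the resulting query with the subscriber's privileges (so RLS applies) and checking whether a row comes back decides whether the subscription payload fires.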

Additional protection against bad actors

For good actors I think the above is fairly robust and performant. However bad actors could try and abuse the system to perform DOS-style attacks. For example if they subscribed to postCreated with no filters then for every single post that was created the attacker would be performing an SQL query that invokes RLS. They could potentially set up a number of these in parallel (maybe even thousands) which could cause the system to be overwhelmed as every write to the database would immediately trigger thousands of SQL queries.

If this is a concern then the user could set up validation for any subscription requests (e.g. by writing a graphile-build plugin) that might enforce that certain filters are in place, or that might discard the subscription request outright if the user is not allowed to perform it. This kind of plugin is fairly easy to construct and is very flexible.

Revoking sessions/subscriptions

The above assumes that all queries will be running with the credentials that the user used when they initiated the subscription. If the user logs out (or is forcefully logged out!) but does not terminate the websocket connection then the subscriptions will continue to arrive using their old auth. I've not solved this part yet.

💼 If any companies want to support this work, please get in touch! There's one company already interested - if anyone else wants to join in that will help share the financial load! 💼

🙏 Please support my Patreon: https://www.patreon.com/benjie - huge thanks to those of you already supporting it, and a gigantically humongous thanks to one very generous supporter in particular! 🙏

benjie commented 6 years ago

For when we do CRUD / Logical Decoding subscriptions: https://hackernoon.com/the-hybrid-strategy-for-graphql-subscriptions-dd5471c45755

benjie commented 6 years ago

Potentially actively kill the websocket connections every period of time (e.g. 15 minutes or so) to force re-authentication.
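As a rough sketch of that idea (hypothetical; the socket object shape and the interval are assumptions):

```javascript
// Hypothetical sketch: periodically close websocket connections older than
// maxAgeMs so clients must reconnect and re-authenticate, bounding how long
// stale credentials can keep a subscription alive.
function recycleStaleSockets(sockets, maxAgeMs, now = Date.now()) {
  const closed = [];
  for (const socket of sockets) {
    if (now - socket.connectedAt >= maxAgeMs) {
      socket.close(); // client reconnects with fresh credentials
      closed.push(socket);
    }
  }
  return closed;
}
```

A server would call this on a timer, e.g. every minute with `maxAgeMs` of 15 minutes.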

benjie commented 6 years ago

__latestOnly__

I'm thinking another reserved property we can add is __latestOnly__: uniqueStringHere - this would signal to the server that if it starts getting overwhelmed with notifications that it can drop any notifications on the same channel that have the same value for __latestOnly__ and only deliver the latest one - a way of clearing through the backlog faster.

This might be useful for UPDATE clauses where you don't care about the intermediary changes - you just want to know what it is NOW - in this case the __latestOnly__ value would be something unique and persistent about the record e.g. its primary key.

This would only take effect if that key was specified and it's value was non-null; e.g. it wouldn't necessarily make sense to use for INSERT actions because each of those need to be notified to the client.

I'm not planning to build this right now, but we can add it later without breaking backwards compatibility because __*__ are reserved properties.
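The compaction step could be sketched like so (hypothetical; the notification data shapes are assumptions):

```javascript
// Hypothetical sketch of __latestOnly__ compaction: when the backlog grows,
// keep only the newest notification per (channel, __latestOnly__) pair;
// notifications without a non-null __latestOnly__ are always kept.
function compactBacklog(notifications) {
  const latestByKey = new Map();
  for (const n of notifications) {
    const tag = n.payload && n.payload.__latestOnly__;
    if (tag == null) continue;
    latestByKey.set(`${n.channel}:${tag}`, n); // later entries win
  }
  return notifications.filter((n) => {
    const tag = n.payload && n.payload.__latestOnly__;
    if (tag == null) return true; // e.g. INSERT events are all delivered
    return latestByKey.get(`${n.channel}:${tag}`) === n;
  });
}
```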

benjie commented 6 years ago

When __node__ is specified we can go one step further and check that the user doing the subscribing is allowed to view that node; e.g. we might run an SQL query with their privileges along the lines of select 1 from table_name where primary_key = $1.

Problem: what if the event is informing us that the node was deleted? I think in that case the event would need to come along with __nodeDeleted__: true so we skip the node check (and also skip the node fetch). This does leak information though (i.e. that the record existed, and what its primary keys were). I think this will just have to be a caveat that we accept.

frankdugan3 commented 6 years ago

Early on in one of my projects I was using RethinkDB changefeeds for realtime updates. I eventually went back to Postgres and decided to manually trigger subscription events via GraphQL mutations. I came across an interesting article where someone re-implemented changefeeds in Postgres and actually improved performance. There are links to the code in the article. I figured this might be helpful, as it seems like a proven and well-tested application of some of the ideas proposed here.

While there is a difference between @live and subscriptions, I think some of the ideas in the article are useful for building change detection dependencies, which may help the overall efficiency of the subscription system.

I'll admit, the implementation details are a little over my head, but I'm excited to see where this all ends up. Great work!

xorander00 commented 6 years ago

I haven't read this entire thread, so there might be duplicate information and/or ideas that have already been shot down.

I'll add more comments as they come to me.

xorander00 commented 6 years ago

@jmealo Do you have some links that I can read up on as to the reliability guarantees of pg_listen/pg_notify? I'm having trouble finding relevant info. I'd like to look over what's already out there before going through the source lol. Thanks! :)

benjie commented 6 years ago

Probably the best place to read about them is in the docs:

if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed

if a listening session receives a notification signal while it is within a transaction, the notification event will not be delivered to its connected client until just after the transaction is completed (either committed or aborted)

If the same channel name is signaled multiple times from the same transaction with identical payload strings, the database server can decide to deliver a single notification only. On the other hand, notifications with distinct payload strings will always be delivered as distinct notifications. Similarly, notifications from different transactions will never get folded into one notification. Except for dropping later instances of duplicate notifications, NOTIFY guarantees that notifications from the same transaction get delivered in the order they were sent. It is also guaranteed that messages from different transactions are delivered in the order in which the transactions committed.

Postgres only sends the notifications to connected clients, it does not cache them, so if a client loses connection it will not receive the notification.

xorander00 commented 6 years ago

Ah ok, so it's pertaining to Postgres behavior by design. I misinterpreted and thought the implication was that it sometimes failed to deliver notifications to actively connected clients.

My personal preference would be bridging from Postgres to Redis or RabbitMQ to queue up the notifications. Between the two, it comes down to durability/reliability. If notifications for subscriptions don't require delivery guarantees, then I'd probably go with Redis. Otherwise, I'd go with RabbitMQ.

If going with Redis, then Nchan would be a great option (provided that delivery is reliable). One potential issue though is that Nchan isn't really designed such that clients can directly PUB/SUB to messages from Redis alone (they want you to use the HTTP API). A positive to this approach though is that it's integrated into nginx, which means that HA is easier because it's decoupled. If an upstream server, like PostGraphile, goes down, the client won't lose the WebSocket connection.

With respect to RabbitMQ, I'm actually working on a plugin that subscribes to notifications from Postgres and publishes them into a RabbitMQ exchange. I currently like this approach, but we'll see how it goes once I'm done with it and have had a chance to see the pros and cons.

I use PUB/SUB for both server-side task queues as well as client notifications. The former requires guarantees while the latter does not. I have both RabbitMQ and Redis clusters going, but the former has more flexibility and reliability for me (I don't want it to drop task events, but client events aren't mandatory), so I'm using that as the underlying service for message storage and delivery.

Anyway, food for thought.

benjie commented 6 years ago

I'm not aware of any issues with LISTEN/NOTIFY other than those pointed out in the manual. I know in 2010 they were discussing an issue where notifications were not sent across the whole cluster (only to the DB node to which you were connected), but I've yet to determine if this is still the case in 2018. Hopefully they've solved it by now, and since the manual doesn't seem to mention it (or at least I've not found a mention yet) I'm inclined to believe this is the case until proven otherwise. (I'm planning to test this for myself at some point.)

Here's the work queue I use with PostgreSQL / PostGraphile, it uses LISTEN/NOTIFY to get near-instant notification of new jobs, SKIP LOCKED to be able to run the tasks in parallel, and seems to work very stably. I've been running it in various places for about 6 months and no issues yet. You could easily adapt this so that the worker's sole purpose is to move the tasks into a different queue system if you prefer.

https://gist.github.com/benjie/839740697f5a1c46ee8da98a1efac218

xorander00 commented 6 years ago

Yup, that mailing list thread was the only thing that popped up when I was Googling. I'm also going to talk to some of the committers tomorrow for further feedback.

I've seen your gist before and it looks pretty good :) One issue though for me is that it requires extensions, which usually can't be installed onto hosted providers. The best article I've read thus far on using Postgres as a job queue is this one, which digs into the innards of the implementation and shows benchmarks.

I'm using Celery with RabbitMQ for my task queues and Redis for result storage and caching. I'm not sure what the best approach to PostGraphile subscriptions is going to end up being, but it should generally be independent of the storage system.

benjie commented 6 years ago

Both pgcrypto and uuid-ossp are pretty standard extensions available on all Postgres hosting providers I've looked at (RDS, Heroku, Google, Azure, etc). Though come to look at it, I don't think it actually needs pgcrypto, and it only uses uuid in one place, to generate a unique name for the queue if one is not given; both extensions could be removed with a minimum of effort. Thanks for the feedback!

The post you reference is from 2015, before SKIP LOCKED was introduced in PostgreSQL 9.5, which improves the situation a bit. It's still not going to be as performant as a dedicated task queue, but it's a good starting point until you need more performance.

xorander00 commented 6 years ago

@benjie SKIP LOCKED is a godsend, helps reduce the complexity :)

What's the general status on subscriptions thus far? I figure a quick & current status update would be good to have. The comments on this issue are getting pretty long and some are pretty dated, so it'll save some time.

I'm open to contributing if I can as it's something that would be useful for my own use-cases in the coming few months. I mentioned in #523 that I'm currently using STOMP for push events, but I'd love to be able to use that as a secondary option and have the primary option be GraphQL subscriptions. I'm currently looking at potentially implementing a custom PubSub provider for graphql-subscriptions.

metaheap commented 6 years ago

I'm thinking about using gun instead of socket.io for sending messages from the db to the browser and react-native. Does anyone have any experience with this?

sjmcdowall commented 6 years ago

I am a bit confused here -- from what I am reading 'gun' IS a DB. It's a graph DB to be exact .. so not sure what gun and postgraphile have to do with each other?

You would use gun or postgraphile (or even both if it makes sense depending on the data) .. but .. never gun WITH postgraphile ??

Am I missing something here?


jmealo commented 5 years ago

I believe that the latest versions of PostGraphile and PostgreSQL now have the primitives required to implement this sensibly. I'm swamped at work right now but I'm cobbling something together for a prototype, I'll share a simple solution should I find one.

benjie commented 5 years ago

Cool 👍 Chat with me on Discord about it and I’ll help you make it a server plugin so you can work on it without forking 🤘

JeetChaudhari commented 5 years ago

@benjie Is there any way I can use subscriptions with apollo-server? I am using 'postgraphile-apollo-server' with nestjs framework, it works great for queries and mutations however I am not sure how to use subscriptions. I can write my own custom subscription but I would like to do it via postgraphile plugin.

benjie commented 5 years ago

@JeetChaudhari I don't have enough experience with Apollo Server to know for sure; but have you tried creating a Subscription extension with makeExtendSchemaPlugin? Maybe it Just Works ™️?

https://www.graphile.org/postgraphile/make-extend-schema-plugin/

const { makeExtendSchemaPlugin, gql } = require("graphile-utils");

module.exports = makeExtendSchemaPlugin({
  typeDefs: gql`
    extend type Subscription {
      testSubscription: Int
    }
  `,

  resolvers: {
    Subscription: {
      testSubscription: {
        subscribe: () => {
          // return async iterator here
        },
        resolve: d => {
          console.log(d);
          return d;
        },
      },
    },
  },
});

xorander00 commented 5 years ago

@JeetChaudhari

My current approach is to utilize Apollo schema stitching with link composition.

I have a thin, root GraphQL server that stitches together other upstream APIs (both GraphQL & REST). One of those servers is an Apollo GraphQL server that's dedicated to subscriptions. The root server uses link composition (aka. splitting) to detect whether the incoming request is a subscription and, if so, routes it to the upstream subscription server. Regular queries & mutations go to PostGraphile (or other relevant servers).

JeetChaudhari commented 5 years ago

@benjie I tried, but I always get the following error:

{ "error": { "message": "Subscription field must return Async Iterable. Received: undefined" } }

Maybe there is something wrong with my implementation. I will give it a try with the example you provided and let you know.

@xorander00 Thank you for showing the approach. I would try to get this done through the plugin but if it would take too much time, I would try approach provided by you.

JeetChaudhari commented 5 years ago

@benjie Thank you, it worked like a charm. I was referring to https://github.com/nestjs/nest/blob/master/sample/12-graphql-apollo/src/cats/cats.resolvers.ts and, as in that example, I wasn't writing a resolve method, only subscribe. Here is my test subscription plugin.

import { makeExtendSchemaPlugin, gql } from 'graphile-utils';
import { PubSub } from 'graphql-subscriptions';

const pubSub = new PubSub();
let count = 0;
function emit() {
  count++;
  console.log('emit called');
  pubSub.publish('eventName', count);
  setTimeout(() => {
    emit();
  }, 1000);
}

emit();

module.exports = makeExtendSchemaPlugin({
  typeDefs: gql`
    extend type Subscription {
      testSubscription: Int
    }
  `,

  resolvers: {
    Subscription: {
      testSubscription: {
        subscribe: () => pubSub.asyncIterator('eventName'),
        resolve: d => {
          console.log(d);
          return d;
        },
      },
    },
  },
});

benjie commented 5 years ago

Super happy it worked for you! You may want to remove the console.log, I just added that to help you debug 👍

benjie commented 5 years ago

Super excited to announce that we just released 4.4.0-alpha.0 which includes OSS subscriptions and live queries support. To install, use postgraphile@next, and otherwise follow the instructions on the website: https://www.graphile.org/postgraphile/realtime/

I'd love to hear what you think! Come chat in our Discord: http://discord.gg/graphile

katywings commented 5 years ago

Thanks a lot for your hard work 🙂

benjie commented 5 years ago

Finally closing this, one of the oldest open issues in the repo, because 4.4.0 has been out for a while and seems to meet people's needs! 🎉