rmosolgo / graphql-ruby

Ruby implementation of GraphQL
http://graphql-ruby.org
MIT License
5.38k stars 1.39k forks source link

[Warden] Types referenced from invisible fields still show up. #1333

Closed eapache closed 4 years ago

eapache commented 6 years ago

@rmosolgo @xuorig

Given a simple schema like this:

type A {
  b: B
}

enum B {
  VALUE
}

If the type A is hidden then we would expect B to be hidden as well, but this is not the case. This is because when checking whether B is referenced, we only look to the immediate visibility setting of the field b rather than doing the full check as to whether b itself is implicitly hidden (which it is, because it's on A which is hidden).

Instead of checking visible? in https://github.com/rmosolgo/graphql-ruby/blob/eff0a6c952734225bc899ec712d20c5a4e1b9be9/lib/graphql/schema/warden.rb#L152-L155 I believe we can just check visible_field? instead which does the right thing?

Caveats to that suggestion:

rmosolgo commented 6 years ago

Yeah, I agree it should work the way you describe! Thanks for those details.

eapache commented 6 years ago

I took a crack at fixing this and I think I missed something key in my initial analysis: visible_field?(b) doesn't check the visibility of A, it checks the visibility of B. In fact, I don't see a way to get from the GraphQL::Field for b to the type A in the first place; is this even possible?

eapache commented 6 years ago

GraphQL::Schema::Field has an owner which would work, is updating the warden to use the class-based classes a possibility?

eapache commented 6 years ago

This problem effectively devolves into https://en.wikipedia.org/wiki/Reachability from the root for a cyclic digraph. Eagerly walking from the root would work but would require preprocessing the entire schema on every new warden (aka every request). Even lazy path-climbing back to the root (assuming we can get from a field to its owner) isn't going to be very efficient depending on the schema.

I appreciate the current warden implementation as in fact a very good sort of "best-effort" to do as much reference-checking as possible without actually doing any recursion, however I'm starting to wonder if it's more effort than it's worth. Now that I've grokked the problem I can point out a few other potential surprises here that aren't fixable without a full reachability analysis.

I'll give it some more thought to see if there's a clever (efficient) algorithm for reachability I can construct given some of the specific constraints of the GraphQL schema graph, but if not I wonder if we shouldn't be going the other way: simplify the warden to do only a minimal, well-defined set of reference checks and document it explicitly as the schema author's problem to make sure their filter is self-consistent.

rmosolgo commented 6 years ago

It sounds like you've had an ... interesting ... afternoon πŸ˜†

has an owner

I think it would be ok to use that _when it's available, something like:

(owner_type = field.metadata[:type_class].type.unwrap) && visible?(owner_type.graphql_definition))

(with a lot of hand-waving to maintain compatibility).

Alternatively, I certainly couldn't say no to a rewrite of some kind! I wonder if it would be possible to do a shiny new take on schema filtering with class-based schemas, while keeping the old one for compatibility for a little while. If you want to open a dedicated issue or an exploratory PR, I'm definitely game.

eapache commented 6 years ago

Another schema that my brain cooked up to complicate matters:

type Query {
  a: A
  P: P
}

# First chain

type P { x: Q }
type Q { x: R }
type R { x: String }

# Second chain

type A { x: B }
type B { x: C }
type C { x: D }
type D { x: P }

If I filter out type R then the entire schema should end up empty, but that’s incredibly difficult to compute without multiple passes.

eapache commented 6 years ago

TLDR: I do not believe this problem is practically solvable. I recommend simplifying the warden to avoid surprises, at the cost of requiring schemas to be more thorough in what they filter.

edit: maybe it is after all, see https://github.com/rmosolgo/graphql-ruby/issues/1333#issuecomment-397817135

Recommendation

The goals for the warden system are (in no particular order):

The current warden falls down on the third point since it tries to solve some of the complexity of this problem (see Appendix 1) but can't solve all of it, leading to many surprising edge cases.

Given this, I believe:

If this is unacceptable for whatever reason, the most likely candidate for a "full" solution is "Precomputing distance to root" in Appendix 2.

Appendix I: Problem Definition

Summary

General notes:

To correctly determine if an element is visible, we need to check three things:

Graph(QL) Theory

Given a schema, we can transform it into a directed graph G = (V,E).

Let V (the vertices of G) be all elements of the schema: the set of all types, enum values, fields, and arguments.

Let E (the edges of G) be determined as follows:

Let r be the vertex in G representing the QueryRoot (or MutationRoot or SubscriptionRoot or...).

Let T \subset V be the set of all "terminal" vertices in the graph: the set of all scalar types and enum values.

Let f(x) be the filter function which accepts a vertex and returns true or false depending on the context of the request.

Let G' be the subgraph of G constructed by removing all vertices that are arguments (yes this is weird, its use will become apparent).

Let F(G) be the subgraph of G constructed by removing all invisible vertices, accounting for all visibility constraints (filter function, reachability, and validity).

Formalization

Given the above definitions, we can formalize our problem as: given a vertex x \in G, determine if x \in F(G).

In practice this means we need to determine that all three of the following hold:

Slightly Plainer English

For a given schema element we have to check the filter function (which is trivial and I will ignore from here on) as well as solve two symmetric reachability problems: one "backwards" (to the root) and one "forwards" (to a terminal scalar or enum value).

The backwards one is hopefully obvious; an element not reachable from the root should not show up in the schema.

The forwards one requires a bit more explanation. We've constructed the graph such that non-terminal nodes must have an outgoing edge in order to be valid:

The exception to this is fields and arguments: a field may have no arguments, and an outgoing edge to an argument is not sufficient for a field to be valid. This is why we define and operate on G' for the third requirement of our formal definition.

Appendix 2: Explored Solutions

Unfortunately for us, graphs derived from GraphQL schemas in this way are not guaranteed to have any of the following properties: planar, acyclic, biconnected, simple, symmetric. This prevents us from using a couple of existing published algorithms:

I also considered some ad-hoc schemes based on the principle that while F(G) cannot be precomputed for a given request, some knowledge can be precomputed based on G or G':

Precomputing possible paths

If we precomputed and stored the set of paths from r to x in G we could do various nice things with that data to determine if any paths were still present in F(G). Unfortunately that set of paths can take O(2^n) space for some graphs, e.g. ones looking like:

  A   D   G   J  
 / \ / \ / \ / \ 
R   C   F   I   L
 \ / \ / \ / \ /
  B   E   H   K

Precomputing distance to root

If we performed a breadth-first traversal of G, then stored the minimum distance from r for each vertex, we could efficiently find the shortest path back from x to r by using a min-queue. However, this approach would be slow for the common schema structure where a large chunk of interconnected schema is flagged as invisible by a single "bottleneck" field or type, as it would try all possible paths through the interconnected portion even though they all lead to the bottleneck.

That said, with appropriate memoization I do believe this approach is the one most likely to be practical, and it has the nice property of inverting very easily to solve the symmetric "forwards" reachability problem.

For posterity, a slightly more formal explanation.

Appendix 3: References

https://en.wikipedia.org/wiki/Reachability https://en.wikipedia.org/wiki/St-connectivity https://tel.archives-ouvertes.fr/tel-01110316/document

xuorig commented 6 years ago

Amazing analyzis @eapache πŸ‘ŒπŸ‘ŒπŸ‘Œ

Given some recent experiences with the warden, I agree with the solution you propose here.

Less magic in warden, and nore explicit checks in schema masks sounds like the way forward to me too.

eapache commented 6 years ago

TLDR: I completely missed the fact that except for introspection queries, we don't have to operate vertex-at-a-time, we can operate on the entire query at once treating it as a subgraph of G. This opens up a much more efficient (though still complex) option for the common case.

I've been unfortunately unable to stop myself from thinking about this, and I've come to the following (additional) corollary. Given a well-formed graphql query Q = (V', E') \subgraph G that does not use any introspection vertices, then (f(q) \forall q \in V') \implies (Q \subgraph F(G)). That is, given a well-formed query (with no introspection) whose nodes all individually pass the filter function, then we don't have to worry about either reachability constraint because they are implied by the query being well-formed. (This also means Q-without-arguments is a subgraph of F(G') but I'm running out of notations to formalize with.)

Since fully-visible well-formed queries with no introspection (excluding __typename which doesn't matter for this corollary) form the vast majority of common cases, and since we have to validate the query anyway, that means we can blindly check the filter function for all queried vertices and if they all pass then we're done. This actually ends up being more efficient than my proposal for the common case since fields and arguments no longer have to double-check the type they expose, it's implicit in the graph. However, it does exclude two cases:

The above isn't really a formal proof, but I believe the result is an algorithm that is sufficiently efficient under real production workloads, falling back to the slower (but still not terrible) approach only for introspection queries and queries which are already invalid.

The only question is if it's sufficiently simple to bother implementing. My original recommendation of a simplified warden may still be the better choice.

jturkel commented 4 years ago

@rmosolgo / @eapache - I took a crack at removing types from introspection results that become unreachable due to visibility checks in #2596 which seems to solve the core of this issue. Any feedback would be appreciated!

rmosolgo commented 4 years ago

Fixed by #2596

eapache commented 4 years ago

Joel's PR addressed my original example case, which turned out to be half of the total problem (see https://github.com/rmosolgo/graphql-ruby/issues/1333#issuecomment-397740253 for the other half). I'm happy to open a new condensed ticket for the remaining issue, or we should just reopen this one.

rmosolgo commented 4 years ago

a new condensed ticket

Yes, could you please? πŸ˜