netbox-community / netbox

The premier source of truth powering network automation. Open source under Apache 2. Try NetBox Cloud free: https://netboxlabs.com/free-netbox-cloud/
http://netboxlabs.com/oss/netbox/
Apache License 2.0
16.02k stars 2.57k forks source link

GraphQL support #2007

Closed cimnine closed 3 years ago

cimnine commented 6 years ago

Issue type

[X] Feature request [ ] Bug report [ ] Documentation

Environment

Description

I've talked about it a while back, but it came up in a different context in the networktocode#netbox slack channel recently.

Joey Wilhelm [20:13 Uhr] On a related note, returning related objects sounds like a case to look to GraphQL... jstretch [20:16 Uhr] what would GraphQL solve? Joey Wilhelm [20:23 Uhr] You're able to retrieve exactly what you want from an API call, and only what you want. So you can request related objects 5 levels down, or just 2 attributes off the object you're querying (bearbeitet) jstretch [20:24 Uhr] ah Joey Wilhelm [20:24 Uhr] So the API user defines what data they want in the response, rather than the API defining that jstretch [20:26 Uhr] https://github.com/graphql-python/graphene-django jstretch [20:26 Uhr] wonder how involved that is to get going

We've implemented a Netbox integration for GraphQL a while back, but we're not able to maintain it at the moment. It's called django-netbox-graphql. Right now, it's built as a django module that can be plugged into Netbox quite easily. So it should be straight-forward to get going.

Some features:

Whats the case for GraphQL?

GraphQL is a "new" way to consume from a remote system. In contrast to REST, where there is a route for every resource, GraphQL only knows one route. The request to that route contains a query in a well-defined format. In the query the requesting application specifies which resources it would like to consume and, and this is essential, which objects it would like to have nested.

A short example I would like to get all IPs of pool subnets and their corresponding MAC addresses to write them into a DHCP config file.

With REST

With GraphQL

I would have to write one query that returns exactly the fields I need. It would look like this:

{
  prefixes(is_pool: true) {
    id
    family
    prefix
    ip_addresses {
      address
      interface {
        mac_address
      }
    }
  }
}

(Note: This query does not work yet with django-netbox-graphql!)

Conclusion

I understand, that this feature is no priority for Netbox. Also, the Netbox REST API is very powerful and already nests a lot of important information. But I also think that people want Netbox to become their single-source-of-truth for all kinds of information related to devices on their network. A versatile query language, that offers full flexibility and reduces requests to Netbox might be required in such a future.

jeremystretch commented 6 years ago

I'd also like to explore this at some point, but I have to close this out as our current policy does not allow for any new major feature requests until the backlog of existing requests has been worked through.

cimnine commented 6 years ago

We've started an integration of GraphQL into Netbox a while back. It worked great for all implemented queries. Unfortunately we can't maintain it anymore, because we never got to use it. But it might still serve as reference should someone else ever see the huge benefit over static REST APIs and decide to either maintain the project in a fork or integrate it into Netbox.

https://github.com/ninech/django-netbox-graphql/

mraerino commented 5 years ago

I'd also like to explore this at some point, but I have to close this out as our current policy does not allow for any new major feature requests until the backlog of existing requests has been worked through.

@jeremystretch at what point is it possible to reopen this issue?

jeremystretch commented 3 years ago

Re-opening this as a candidate for v3.0.

ryanmerolle commented 3 years ago

I think MVP release for GraphQL should focus on READ / GET only since that is where most people have headaches with REST / post GET data manipulations.

Sure there would be some interesting write/updates using GraphQL, but I am doubtful the ROI is as high. It also allows you to iterate on the integration with a limited scope.

ziggekatten commented 3 years ago

Agree with ryanmerolle. Biggest gains are by far the GET requests.

candlerb commented 3 years ago

Sure there would be some interesting write/updates using GraphQL, but I am doubtful the ROI is as high

There are some important things which are impossible via REST at the moment, and graphQL might be able to offer a solution. In particular: add a device (or VM) + interface + primary IP address in one transaction. Currently there's a minimum of 3 creates plus 1 update to achieve this.

jeremystretch commented 3 years ago

Yes, and it makes perfect sense to keep those as discrete actions, because each is doing something very different. Each object has its own validation logic and implications. Trying to bundle them all together increases the time and memory needed to perform the request. It also greatly complicates the reporting of errors, sending webhooks, etc.

jeremystretch commented 3 years ago

It's also worth pointing out that creating ten devices with a few dozen interfaces each and an IP address on each interface via the REST API also takes four total requests.

jeremystretch commented 3 years ago

I've been experimenting with this for the past few days, and I'm happy to say I've made a surprising amount of progress in a fairly short time. The traditional approach to implementing GraphQL in a Django application requires a fair bit of overhead in defining object types, much of which is redundant to what we've already done with Django REST framework for the REST API.

To help reduce developer burden and help ensure long-term maintainability, I've oped to employ graph_wrap by @PaulGilmartin to leverage our DRF serializers to serve GraphQL requests. Paul has been extremely helpful in working through some initial bumps, and I've already got a functional PoC going.

As you're likely aware, NetBox today answers REST API requests in one of two modes: brief and normal. A brief response includes only the bare minimum representation of each obect (typically its ID, name, and a user-friendly display string), whereas the normal response includes a complete representation, exclusive of downstream related objects. For example, a rack will show its assigned site, but not associated devices within the rack.

graph_wrap, however, requires a complete serialization of each object to support the inclusion of downstream related objects. I can think of two approaches to meeting this requirement:

  1. Extend the base serializer for each object with downstream relationships, and toggle these dynamically
  2. Introduce a new "full" serializer for each object, inheriting from the current base serializer

The first approach may or may not be practical, given the intricacies of serializer initialization, however its at least worth exploring briefly. The second approach I have already confirmed as a potential solution; the only downside is needing to declare separate serializer classes. However, this may actually be more manageable in the long term, and easier to test.

There's an added benefit of either approach: it will potentially enable "full" REST API requests. For example, a request to /api/dcim/intefaces/123/?mode=full might include all assigned IP addresses. We would likely need to figure out some way to dynamically optimize the underlying queries to make this approach feasible for real-world needs, but it's a possibility.

Just wanted to get my thoughts down for now. Will provide further updates when I have more to share.

PaulGilmartin commented 3 years ago

@jeremystretch Happy to help!

If you get time, I'd be interested if you could be a bit more explicit about this statement "graph_wrap, however, requires a complete serialization of each object to support the inclusion of downstream related objects.". Perhaps you could provide an example of the kind of code you're talking about?

jeremystretch commented 3 years ago

@PaulGilmartin sure thing! I probably shouldn't have said graph_wrap, specifically, but rather GraphQL in general, requires this. Here's an example.

In NetBox, we model circuits and providers; every circuit has a provider, and a provider may have one or more circuits. Here's what our ProviderSerializer current looks like:

https://github.com/netbox-community/netbox/blob/7444110c79f0dbbe3a19047962040ebe09eb51cb/netbox/circuits/api/serializers.py#L19-L28

You'll notice that we include a count of child circuits with the provider, but not the circuits themselves. This precludes GraphQL from querying related circuits. To support GraphQL (via graph_wrap), we need to extend our serializers to incorporate child objects. In this example, we need to add something like

class ProviderSerializer:
    ...
    circuits = CircuitSerializer(many=True, read_only=True)

and update Meta.fileds accordingly. But we don't normally want to include these related objects in REST responses, because the performance penalty can be severe (e.g. imagine we have a thousand circuits defined per provider). So the crux of the matter for me at the moment is deciding how to control this; my two options cited above.

I find myself leaning more toward the second option, if only because working with a single serializer per model is going to make it very difficult (likely impossible) to avoid cyclical imports. For example, this obviously won't work:

class ProviderSerializer:
    circuits = CircuitSerializer(many=True)

class CircuitSerializer:
    provider = ProviderSerializer()

But creating a new "full" serializer by subclassing the existing serializer should:

class ProviderFullSerializer(ProviderSerializer):
    circuits = CircuitSerializer(many=True)

class CircuitSerializer:
    provider = ProviderFullSerializer()
jeremystretch commented 3 years ago

Thinking this through a bit further, I think we'll run into an issue in the reverse direction: traversing upstream related objects. For example: Interface -> Device -> Rack -> Site -> Region -> Region -> Region. It may just not be worth pursuing the parallel to DRF any further, owing to the complexity of our data model.

PaulGilmartin commented 3 years ago

@jeremystretch Perhaps it's getting a bit late here and I'm not completely following, but I believe the problem you're describing is very similar to the one I discuss here https://github.com/PaulGilmartin/graph_wrap#which-problems-does-graphwrap-address. If so, it shouldn't be the case that you need to embed the CircuitSerializer inside the ProviderSerializer

class ProviderSerializer:
    circuits = CircuitSerializer(many=True)

graph_wrap should be smart enough to get the circuits_type purely from a HyperlinkedIdentityField (as we describe in the link above with the Authors/Posts example). So you could have something like

class ProviderSerializer:
    circuits = serializers.HyperlinkedRelatedField(
          view_name='circuit-detail', read_only=True, many=True)

and that should give you a full representation of the nested circuits object via /graphql but not via the REST endpoint (providing circuit-detail does point to some sort of appropriate CircuitSerializier). Have you tried this at all?

jeremystretch commented 3 years ago

I did give that a shot early on, but it seems to just load the related objects as strings:

"message":"Field \"circuits\" of type \"[String]!\" must not have a sub selection."

Even if it was able to generate the serializer dynamically, I'm not sure we'd want to give up the control over the displayed object.

PaulGilmartin commented 3 years ago

Not sure why that error came about, but it wouldn't be generating a serializer dynamically. What it does is fetches the serializer associated to the view_name. So for example if we wrote

class ProviderSerializer:
    circuits = serializers.HyperlinkedRelatedField(
          view_name='circuit-detail', read_only=True, many=True)

then the view_name 'circuit-detail' must be associated to a pre-defined CircuitSerializer. This would mean you don't need to explicitly embed it in the ProviderSerializer (which I understood to be your main concern).

I'd be happy to look at the code which gave the error you mentioned if you want to explore this further.

jeremystretch commented 3 years ago

Ah, I see. For whatever reason that doesn't seem to be working at the moment. Ok, it may be worth coming back to that.

However, I think the bigger issue is figuring out how to navigate upstream relationships. Consider this model:

Our current InterfaceSerializer defines its device as a nested serializer, traversing only one level up in the hierarchy. We want to keep this as-is in the REST API serializer. So to support GraphQL, we would need - as far as I can tell - to introduce a separate serializer which defines device as a full serializer, which in turn defines its rack as a full serializer, and so on. Is that right, or am I missing a more elegant solution?

PaulGilmartin commented 3 years ago

@jeremystretch My first point would be that to support GraphQL, we don't necessarily need to have available full representations of every object to the client. GraphQL can still be of great use when only the limited scope of nested objects is ever exposed (as you do with NestedXXXSerializers).

If however your aim of using GraphQL is to be able to expose full representations of nested objects, then I think doing something as I described above wouldn't be too much effort. Let's continue with the interface-device example above. Let's just suppose that the REST API exposes interfaces at /interface/{id}. Currently that's modelled like this:

class InterfaceSerializer(PrimaryModelSerializer, CableTerminationSerializer, ConnectedEndpointSerializer):
    url = serializers.HyperlinkedIdentityField(view_name='dcim-api:interface-detail')
    device = NestedDeviceSerializer()

So a request to fetch an Interface would give JSON of the form

{url:  /interface/{id}, device: {nested device limited representation}}

Now, I see in the codebase that what I think you describe as a "full" serializer already exists for Device:

class DeviceSerializer(PrimaryModelSerializer):
    url = serializers.HyperlinkedIdentityField(view_name='dcim-api:device-detail')
    device_type = NestedDeviceTypeSerializer()
    ...

What we want :

My claim is that this is possible, provided your DeviceSerializer is already exposed via your REST API. Let's assume it is for now, at the URL /device/{id} and corresponding view name device-detail. Then if you add the following field to InterfaceSerializer

class InterfaceSerializer(PrimaryModelSerializer, CableTerminationSerializer, ConnectedEndpointSerializer):
    url = serializers.HyperlinkedIdentityField(view_name='dcim-api:interface-detail')
    device = NestedDeviceSerializer()
    device_2 = serializers.HyperlinkedRelatedField(view_name='device-detail', read_only=True) # new field

then graph_wrap should (i.e. it's a bug if it can't) be able to fetch the full representation of the nested device just from the fact the device url is exposed. Importantly, graph_wrap would make use of the existing DeviceSerializer to generate this full representation. Moreover, adding this device_2 field does not expose the full representation to /interface/id - it simply exposes the nested device URL.

So what if DeviceSerializer is _not already_ exposed via your REST API. Then I would claim that it's perhaps inappropriate to expose it on the /graphql API and we should stick with just exposing the limited NestedDeviceSerializer representation there too.

jeremystretch commented 3 years ago

GraphQL can still be of great use when only the limited scope of nested objects is ever exposed

Agreed, however I think the expectation from our user base is the ability to fully traverse these relationships.

Let's assume it is for now, at the URL /device/{id} and corresponding view name device-detail.

Just confirming that this is a correct assumption.

graph_wrap should (i.e. it's a bug if it can't) be able to fetch the full representation of the nested device just from the fact the device url is exposed.

This is very helpful to know, thanks! I'm going to continue experimenting with this a bit more as time allows and see what I can work out. Thanks again for all your help!

jeremystretch commented 3 years ago

After a lot more exploration, I've decided to push forward just using graphene_django for this implementation. I have a very rough PoC going int the 2007-graphql branch, which is currently limited to the circuits app. I still need to work out support for custom fields, tests, documentation, etc. Once that's all in place, we can extend it to the entire application.

It would be extremely helpful if some of the folks who have expressed interested in GraphQL would take this opportunity to help test. As with every major release, we'll hold a beta period, but the sooner we can act on proposed modifications to the implementation, the better off we'll be.

jeremystretch commented 3 years ago

One consideration that's come up is whether we should namespace the GraphQL entry points with the application name for each model: for example, dcim_device instead of just device. This isn't strictly necessary today as we don't have any models with overlapping names, however this is a future possibility. Take ipam.Role and dcim.DeviceRole, for instance. We've discussed in the past renaming the later to dcim.Role, which would introduce a conflict (though ultimately decided to leave it as-is).

I don't have a strong preference either way, though I want to avoid needing to substantially redefine the GraphQL schema in a future release. Curious to hear any opinions on the matter.

jeremystretch commented 3 years ago

Another issue I've run into: Initially, I planned to use the singular form of a model's verbose name to fetch a single object, and the plural form to fetch a list (e.g. site(id:123) and sites(status:"active"). However, this breaks down when we encounter a model with identical singular and plural names, namely dcim.VirtualChassis.

We can work around this by perhaps appending _list to the model name instead, but I'm not sure what convention is here. I don't have a strong preference, however I do want to avoid any corner case hacks. For example, if we opt to use the _list approach, we do so for all models, not just virtual chassis.

sdktr commented 3 years ago

Are you referring to how naming is represented in the schema, or internally within netbox?

If external/schema, I wouldn't grasp the need to expose singles. If a single site is requested it can be signaled in the qcl query?

Sites(name: "ams1")
Returns 1 result if name is literal match

vs:
Sites(name__ic: "ams1")
Works as a search, can result in matching ams1, ams11, ams123
jeremystretch commented 3 years ago

@sdktr I think there's a lot of value in providing an interface to grab specific objects, just as with the REST API. It covers the (fairly frequent, I'd expect) case where the client expects to receive exactly one object, and that zero or 2+ objects should result in an error. It saves the client from having to implement that logic locally.

jeremystretch commented 3 years ago

I ended up just appending _list to the query name for list fields. Curious to see the feedback during the v3.0 beta, but it suffices for now at least.