fifemon / graphql-datasource

Grafana datasource plugin to query data from a GraphQL API
https://grafana.com/grafana/plugins/fifemon-graphql-datasource
Other
63 stars 35 forks

Support Nested Relationships in Return #15

Open calebsmac opened 4 years ago

calebsmac commented 4 years ago

This is a noob issue so I apologize if there's already a way!

Basically, my data comes back with nested lists. For example, return might look like this in query inspector:

[Screenshot: query inspector output, 2020-05-04]

In prose, the query returns a series of measurement objects, each of which has an associated timestamp and several sub-measurements (e.g. voltage, current, and temperature).

Ideally I'd group by the name (embedded in the lowest-level object) but that looks like: node.fieldValues.edges.0.node.field.name, node.fieldValues.edges.1.node.field.name, etc.

And then the value is in: node.fieldValues.edges.0.node.value.Data, node.fieldValues.edges.1.node.value.Data

I can imagine group by and similar settings allowing a wildcard. However, I'm not sure whether the time would be carried down appropriately.

BTW - this project is very important to me so I'm eager to possibly contribute. However I'm a total Grafana noob so my contributions may be quite slow.

retrodaredevil commented 4 years ago

I don't believe this is possible with the current setup.

I think the simplest change to make this possible would be to add something called "nestedPath" and "nestedGroupBy". For instance, the nested path in this example could be "fieldValues.edges" and the nested group by could be "node.field". I also believe that with your exact query, a time data path may have to be added because the Time field is nested in the node object.
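As a rough sketch of what that could look like in TypeScript: for each element of the data path, resolve the time once, then walk the nested array and emit one row per nested item. Everything here is hypothetical: `nestedPath`, `nestedGroupBy`, `valuePath`, and `flattenNested` are not existing plugin options, the paths are assumed to be relative to each data path element, and the sample response is guessed from the paths quoted in the issue (the timestamp is made up).

```typescript
// Hypothetical sketch; none of these option names exist in the plugin today.

// Resolve a dotted path such as "node.field.name" against a nested object.
function getPath(obj: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((cur, key) => {
    if (cur === null || typeof cur !== 'object') return undefined;
    return (cur as Record<string, unknown>)[key];
  }, obj);
}

interface NestedOptions {
  timePath: string;      // where the timestamp lives in each outer element
  nestedPath: string;    // where the inner array lives in each outer element
  nestedGroupBy: string; // series name, relative to each inner element
  valuePath: string;     // value, relative to each inner element
}

interface Row {
  time: unknown;
  group: string;
  value: unknown;
}

// Emit one row per inner element, carrying the outer timestamp down.
function flattenNested(docs: unknown[], opts: NestedOptions): Row[] {
  const rows: Row[] = [];
  for (const doc of docs) {
    const time = getPath(doc, opts.timePath);
    const nested = getPath(doc, opts.nestedPath);
    if (!Array.isArray(nested)) continue; // skip elements without the nested list
    for (const item of nested) {
      rows.push({
        time,
        group: String(getPath(item, opts.nestedGroupBy)),
        value: getPath(item, opts.valuePath),
      });
    }
  }
  return rows;
}

// Assumed response shape; the timestamp is invented for illustration.
const sample: unknown[] = [
  {
    node: {
      Time: '2020-05-04T09:00:00Z',
      fieldValues: {
        edges: [
          { node: { field: { name: 'voltage' }, value: { Data: 3.3 } } },
          { node: { field: { name: 'current' }, value: { Data: 420 } } },
        ],
      },
    },
  },
];

const rows = flattenNested(sample, {
  timePath: 'node.Time',
  nestedPath: 'node.fieldValues.edges',
  nestedGroupBy: 'node.field.name',
  valuePath: 'node.value.Data',
});
console.log(rows);
```

The rows come out in long format (one row per time and group), which would still need to be grouped into one series per `group` before plotting.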

The result of your query is pretty unique and also not very standard from a GraphQL perspective because GraphQL is strongly typed. You're basically getting an array of field names and values rather than an object where you can pick what values you actually want. I'm guessing it's not possible to change the GraphQL schema or at least create another query on the server side so everything stays compatible? Ideally, you could have a query like this and it wouldn't be as complicated:

{
    someQuery {
        edges {
            Time
            voltage
            current
            temperature
        }
    }
}

Making the Time path customizable is pretty simple, but adding a nested data path and a nested group by will make queries that much more complex. If that's the way we decide to go, maybe we could hide the nested configuration by default.

Maybe @retzkek has an opinion on the best way to do this. If you are interested in contributing, you'll probably be editing https://github.com/fifemon/graphql-datasource/blob/master/src/DataSource.ts and will likely edit https://github.com/fifemon/graphql-datasource/blob/master/src/types.ts and https://github.com/fifemon/graphql-datasource/blob/master/src/QueryEditor.tsx

calebsmac commented 4 years ago

Thanks for the feedback @retrodaredevil. I created https://github.com/fifemon/graphql-datasource/issues/16 for the time path. Interested in your feedback there, that might be a first task I take on...

retzkek commented 4 years ago

This was a concern of mine when I first started this project - GraphQL queries can be infinitely complex, requiring an equally complex configuration system to support the general case, so I quickly decided to just focus on the simple case, which I expected would meet a majority of use cases.

However, this edge/node relationship model is definitely common, especially since it's supported by Relay. The good news at least is that it's a de-facto standard, so we're not talking about supporting arbitrarily nested structures. I don't immediately have a good idea how best to support it though.

We will also have to think about how to handle pagination. Playing around with the GitHub API, for instance, I hit a limit of 100 results per page for Security Advisories, so we'd have to request enough pages to fill in the dashboard time range.
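For reference, a minimal sketch of cursor-based paging over a Relay-style connection, in TypeScript. It assumes the server exposes the standard `pageInfo { hasNextPage endCursor }` fields and an `after` argument; `fetchPage`, `fetchAllNodes`, and the `maxPages` cap are hypothetical names for illustration, not part of the plugin.

```typescript
// Shape of one page of a Relay-style connection.
interface PageInfo {
  hasNextPage: boolean;
  endCursor: string | null;
}

interface Connection<T> {
  edges: Array<{ node: T }>;
  pageInfo: PageInfo;
}

// Placeholder for whatever issues the GraphQL request, passing `after` as the cursor.
type FetchPage<T> = (after: string | null) => Promise<Connection<T>>;

// Follow endCursor until the server reports no further pages,
// or until maxPages is reached (a guard against runaway requests).
async function fetchAllNodes<T>(fetchPage: FetchPage<T>, maxPages = 10): Promise<T[]> {
  const nodes: T[] = [];
  let cursor: string | null = null;
  for (let page = 0; page < maxPages; page++) {
    const conn: Connection<T> = await fetchPage(cursor);
    for (const edge of conn.edges) {
      nodes.push(edge.node);
    }
    if (!conn.pageInfo.hasNextPage) break;
    cursor = conn.pageInfo.endCursor;
  }
  return nodes;
}
```

A real implementation would probably stop on the dashboard time range rather than a fixed page count, for example once the oldest node on a page falls before the range start.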

I'm definitely looking for help on this.

universalappfactory commented 4 years ago

Hi,

I had the same issue. I solved it by adding a 'Map' function, where I can transform the resulting object:

I have queries like this:

{ numericValues(created_Gte: "$timeFrom", created_Lt: "$timeTo") {edges{node{Time: created, value}}} }

with something like this as result:

{ "data": { "numericValues": { "edges": [ { "node": { "Time": "2020-08-05T05:12:35.027698+00:00", "value": 8.81 } } ] } } }

I can now provide a custom map function in Grafana (I added a field below Alias By):

return {Time: input.node.Time, value: input.node.value}

I'm not sure whether there are security issues, as I'm using the JavaScript Function() constructor. Perhaps a convention-based mapping would be better?
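To make the mechanism concrete, here is a minimal TypeScript sketch of compiling a user-supplied function body with `new Function` and applying it to each edge. This is not the code from the fork; `compileMapper` and `applyMapper` are hypothetical names. Note that `new Function` is not a sandbox: the body runs with access to globals, which is the security concern.

```typescript
type Mapper = (input: unknown) => unknown;

// Compile a function body (as typed into a hypothetical "map" field)
// into a function of a single `input` argument.
// WARNING: this executes arbitrary user code with access to global scope.
function compileMapper(body: string): Mapper {
  return new Function('input', body) as Mapper;
}

// Apply the compiled mapper to every edge of a response.
function applyMapper(edges: unknown[], body: string): unknown[] {
  const fn = compileMapper(body);
  return edges.map((edge) => fn(edge));
}

// Response edges taken from the example result above.
const edges: unknown[] = [
  { node: { Time: '2020-08-05T05:12:35.027698+00:00', value: 8.81 } },
];

const mapped = applyMapper(
  edges,
  'return {Time: input.node.Time, value: input.node.value}'
);
console.log(mapped);
```

Environments with a strict Content Security Policy (no `unsafe-eval`) would reject `new Function` outright, which is another argument for a declarative mapping.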

If this points in the right direction for this issue, I'll make a pull request.

calebsmac commented 4 years ago

This sounds powerful and quite useful to me, would solve a lot of problems. I don't know about concerns on use of Function though... Arbitrary JS is disabled by default on the text widget, though personally I do have it enabled.

Even if this isn't accepted I'd be interested in seeing your code if you're willing to share it

universalappfactory commented 4 years ago

Yes sure, it's just living in my fork: https://github.com/universalappfactory/graphql-datasource/tree/feature/relay

retzkek commented 4 years ago

Hey @calebsmac thanks for contributing. So the idea of executing arbitrary JS is definitely a bit worrisome. However, and perhaps this is just a case of a simple example, but wouldn't #16 actually address your use case? In this example the data path would be numericValues.edges then you could configure the time path to be node.Time.
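A small TypeScript sketch of how that combination would resolve against the example response, assuming a configurable time path as proposed in #16. `resolvePath`, `extractPoints`, and the `valuePath` parameter are hypothetical names used only for illustration, not the plugin's actual code.

```typescript
// Resolve a dotted path such as "numericValues.edges" against a nested object.
function resolvePath(obj: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((cur, key) => {
    if (cur === null || typeof cur !== 'object') return undefined;
    return (cur as Record<string, unknown>)[key];
  }, obj);
}

interface Point {
  time: unknown; // left unparsed here; a real implementation would convert to a timestamp
  value: unknown;
}

// The data path selects the array of documents; the time and value paths
// are then resolved relative to each document.
function extractPoints(
  data: unknown,
  dataPath: string,
  timePath: string,
  valuePath: string
): Point[] {
  const docs = resolvePath(data, dataPath);
  if (!Array.isArray(docs)) return [];
  return docs.map((doc) => ({
    time: resolvePath(doc, timePath),
    value: resolvePath(doc, valuePath),
  }));
}

// The "data" member of the example response above.
const data: unknown = {
  numericValues: {
    edges: [{ node: { Time: '2020-08-05T05:12:35.027698+00:00', value: 8.81 } }],
  },
};

const points = extractPoints(data, 'numericValues.edges', 'node.Time', 'node.value');
console.log(points);
```

This covers a single level of edges only; it does not help when each node contains a further array, which is the original case in this issue.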

calebsmac commented 4 years ago

@retzkek I don't see how it does, since I have nested arrays here.

Each measurement has several associated sub-measurements, and those have a name and a value property. So if I set my data path to be at the level which includes the timestamp, I have an array as the data - data.0.name = voltage, data.0.value = 3.3, data.1.name = current, data.1.value = 420

But I think @universalappfactory 's solution would give the flexibility I need, along with some other VERY nice features like the ability to convert 420 mA to A, apply a calibration factor, etc...

I haven't contributed anything yet for the record, except for use cases :)

retzkek commented 4 years ago

d'oh, I meant to tag @universalappfactory not @calebsmac , sorry! 🤦

retzkek commented 4 years ago

"along with some other VERY nice features like the ability to convert 420 mA to A, apply a calibration factor, etc..."

@calebsmac have you looked at the "Add field from calculation" transformation (added in Grafana 7)? The "binary operation" mode should meet both those use cases, being able to apply an operation to two fields or a field and a constant.

Anything more complex than that and you may need to look at making a custom plugin to be honest.

universalappfactory commented 4 years ago

Hi @retzkek

Yes, #16 is probably a better solution, at least for my use case. I'll check whether it works for me.

yohoe commented 3 years ago

It seems to me that this and similar problems could be solved by enhancing the path and field syntax:

  1. Allow wildcards (for array indices) in the data path, e.g. edges.*.node.fieldValues.edges.*.node.value
  2. Allow parent references in the time path, group by and alias configurations, e.g.
    • time path: ^.^.^.^.^.Time (going up 5 levels, then to the Time field)
    • group by: ^.field.name,^.^.^.^.^.^ (field name and index in outer array)
    • alias: $field_^.field.name $field_^.^.^.^.^.^ (yielding voltage 0, voltage 1 etc.)

An alternative for 2. would be to interpret the time path, group by and alias configurations relative to the data root, but this would break dashboards.

My use case involves time series data nested below the data point description:

"properties": [
  {
    "label": "M31",
    "series": {
      "entries": [
        { "Time": 1322697600000, "value": 1.1},
        { "Time": 1322784000000, "value": 1.2},
        ...
      ]
    }
  },
  {
    "label": "M32",
    "series": {
      "entries": [
        { "Time": 1322697600000, "value": 2.1},
        ...
      ]
    }
  }
]

I can extract the time series data for one data point with the plugin as-is, but the data path has to point to the nested array (properties.0.series.entries), leaving me unable to reference the label or include both data points. The changes I proposed above would allow me to extract all time series, labeled, at once, in a generic manner.
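A minimal TypeScript sketch of the proposed syntax applied to this example: `*` fans out over array elements in the data path, and each match remembers its ancestors so that a leading `^` can step back up. `expandPath` and `resolveRelative` are hypothetical helpers, not plugin code; array indices and the arrays themselves are assumed to count as separate levels, and `^` is only supported at the start of a relative path. Resolving a reference to the array index (as in the group by example) is not covered.

```typescript
// A value matched by the data path, plus the chain of containers above it
// (root first, nearest parent last).
interface Match {
  value: unknown;
  ancestors: unknown[];
}

function child(parent: unknown, key: string): unknown {
  if (parent === null || typeof parent !== 'object') return undefined;
  return (parent as Record<string, unknown>)[key];
}

// Expand a dotted path where "*" matches every element of an array.
function expandPath(root: unknown, path: string): Match[] {
  let matches: Match[] = [{ value: root, ancestors: [] }];
  for (const seg of path.split('.')) {
    const next: Match[] = [];
    for (const m of matches) {
      const ancestors = [...m.ancestors, m.value];
      if (seg === '*') {
        if (Array.isArray(m.value)) {
          for (const item of m.value) next.push({ value: item, ancestors });
        }
      } else {
        const v = child(m.value, seg);
        if (v !== undefined) next.push({ value: v, ancestors });
      }
    }
    matches = next;
  }
  return matches;
}

// Resolve a path relative to a match; each leading "^" moves up one level.
function resolveRelative(match: Match, relPath: string): unknown {
  let value = match.value;
  const ancestors = [...match.ancestors];
  for (const seg of relPath.split('.')) {
    value = seg === '^' ? ancestors.pop() : child(value, seg);
  }
  return value;
}

// The example data above, with the elided entries left out.
const root: unknown = {
  properties: [
    { label: 'M31', series: { entries: [
      { Time: 1322697600000, value: 1.1 },
      { Time: 1322784000000, value: 1.2 },
    ] } },
    { label: 'M32', series: { entries: [{ Time: 1322697600000, value: 2.1 }] } },
  ],
};

// Entry -> entries array -> series -> property object, hence three "^".
const labeled = expandPath(root, 'properties.*.series.entries.*').map((m) => ({
  label: resolveRelative(m, '^.^.^.label'),
  time: resolveRelative(m, 'Time'),
  value: resolveRelative(m, 'value'),
}));
console.log(labeled);
```

Grouping the resulting rows by `label` would then yield one series per data point without knowing the number of properties in advance.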

Edit: I realize that I can configure multiple data paths if I know the number beforehand. Solving the label problem and the generic case would still be nice, though.

tyler-dunkel commented 2 years ago

This problem is certainly something my team is interested in! @retzkek are you still looking for help on solving this? We also would really benefit from having some way to plot values based on an array that is nested in another array.

alisters commented 2 years ago

@retzkek - I realise you have appealed for a new maintainer for this repo, but I'm hoping you may still answer this comment. Is this something that could be solved by making the group by values JSONPath expressions? I'm not a Node/TypeScript guru, and this is my first look at Grafana and GraphQL. I was using it to explore creating a better operations dashboard for GitLab pipelines, whose GraphQL model has this nested array structure. My initial query is below:

{
    group(fullPath: "graphqltest") {
        name
        projects {
            nodes {
                name
                pipelines {
                    nodes {
                        status
                        id
                        createdAt
                        duration
                        jobs {
                            nodes {
                                name
                                status
                                stage {
                                    name
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

I didn't have any success with the current group by expressions. I'm still looking at the code to see if I'm doing something wrong. I tried a bunch of different paths, but couldn't get the data set to have more than 1 row.

alisters commented 2 years ago

Regardless, I've forked the repo and started to give it a try. I'll post back on how it goes.

EvertonSA commented 2 years ago

Hi @alisters, welcome to the team of people trying to make sense of the GitLab GraphQL API. I have a similar situation where I get all the data with just one GraphQL query but need it in multiple rows. For now, I'm duplicating panels. My solution sucks. I tried all available transformations but have had no success so far. Did you manage to split the results into rows?


alisters commented 2 years ago

@EvertonSA - I have not managed to split the data into two rows yet. I raised a new issue: https://github.com/fifemon/graphql-datasource/issues/84#issue-1417598690

I don't know if that is the right thing to do, but this issue and another, https://github.com/fifemon/graphql-datasource/issues/58, both seem to be subsets of generally supporting APIs structured according to the connection specification: its pagination (issuing multiple queries) and the parsing of nested arrays (this issue).