insights: graphql API has too much complexity between chart types and series types

coury-clark commented 2 years ago

In our GraphQL API we have mutations such as the following createLineChartSearchInsight that couple a visualization type (line chart) to a series type (search). This has a number of problems:

We have to support a combination of mutations for many visualization and series tuples
Evolving visualization options for chart or series is tedious and error prone
Introducing new chart types is very time consuming, even though the underlying data to render is already available

We should consider alternative approaches that will reduce this complexity. Some ideal outcomes may be:

We can freely evolve presentation settings without interacting with many different mutations. In other words, updating a line chart with a new setting should only require modifying one input object that relates to line charts.
We can more freely express charts that are composed of multiple series types. For example, a line chart might be composed of a search series and a (future) compute series.
We can support changing chart types on an insight (for example, line -> pie)

sourcegraph-bot-2 commented 2 years ago

Heads up @joelkw @felixfbecker @vovakulikov @unclejustin - the "team/code-insights" label was applied to this issue.

coury-clark commented 2 years ago

One possibility to consider is cleanly separating visualization mutations from data series mutations. To accomplish this we would do something along the lines of:

Define mutations per visualization type. Line chart, pie chart, etc.
1. One question might be, how does this interact if chart types are similar but not exactly the same? For example, stacked line charts? Does this become the same problem we are trying to solve in the OP?
2. An example might be updateLineChartAttributes
Resolve the appropriate attribute response objects in the resolver hierarchy based on the defined chart type
Introduce a new relationship from each insight to each visualization type of attributes, with all attributes having a defined default value.
(Optional) Introduce a new mutation to modify the visualization type (line -> pie)

An example of this relationship

This might raise some other questions though, for example:

How do we interact with modifying the series attributes? Are these separate mutations, something like updateLineChartSeriesAttributes, or does it roll up in the original line chart mutation?
Is it necessary to consider attributes that are not associated visually, but through a data series? For example, would setting an attribute on a search series be required, and if so, does this fundamentally break this model?

Joelkw commented 2 years ago

related to #33931

chwarwick commented 2 years ago

Leaving this here since it's since I'll consider it related. In the past when looking at graphql and dashboards I had come across this: https://devblogs.microsoft.com/cse/2017/09/28/data-independent-graphql-using-view-model-based-schemas/ and while I didn't end up needing to do that in the past, I found the approach interesting.

chwarwick commented 2 years ago

I found the approach interesting.

The part I think the most about is the separation of the definition of what queries to run to capture data and then how you want to visualize it.

unclejustin commented 2 years ago

Echoing @chwarwick 's comment above. I know (think?) we have to have slightly different data shapes for different chart types, but in general, if we could have a data query that stores the "information". And a visualization query that describes the chart and points to the data. That's what makes sense to me.

In the UI the data query maps to the "what". i.e. the search query And the viz query maps to meta information and links to the appropriate data query. Then, an update to the viz (like changing to a different chart) wouldn't need to refetch data if the data shapes were compatible.

vovakulikov commented 2 years ago

@unclejustin @chwarwick during our conversation with @coury-clark we agreed that we would probably find a solution for querying but we've got a few questions around mutations

How to handle mutation for different types of charts?
We could have series-like and categorical-like data shapes how to enforce that series-like shapes can't be represented in the categorical chart types
Do we need to split mutation for each chart type?
In theory, we want to have charts where one series is represented with a bar chart and another with a line chart how to handle this in mutation payload?

@coury-clark correct me if I missed something there

chwarwick commented 2 years ago

@vovakulikov What I was thinking and I haven't worked out all the details to tell if it's possible would be to look at the creation of a series as separate from how it's visualized.

If we look at the core pieces of data that makes up the series it does not specify how or limit us in the way that we eventually present that data back to the user:

Name
Search Query
How often data is refreshed
What repos it runs over

(The last two may be better kept at a higher place in the tree so that all series in the insight share them but there also there could be benefit from allowing them to differ)

The data from those series could be a line or bar chart with historical values, it could be a pie chart with only the most recent data.

The second piece would be then at the insight view level defining which series to include in the view and the default shape (chart type) that would want those series returned in. I think it would be possible make this a single type without having to create something with too many optional values that cover all case. A quick first pass could be something that looks similar to:

input SearchInsightInput {
    """
    The list of data series to create (or add) to this insight and how to display them.
    """
    dataSeries: [InsightDataSeriesDisplayInput!]!

    """
    Title of the insight.
    """
    title: String

    """
    The dashboard IDs to associate this insight with once created.
    """
    dashboards: [ID!]

    """
    The default values for filters and aggregates for this line chart.
    """
    viewControls: InsightViewControlsInput
}
enum ChartType {
    LINE
    BAR
    PIE
}
input InsightDataSeriesDisplayInput {
    """
    Options that cover all charts types (this would need to be exhaustive of any type of chart and most likely contain quite a few optional fields
    """
    options: ChartOptionsInput!
    """
    The default type of chart to display
    """
    chartType: ChartType
    """
    The series to be displayed
    """
    series: InsightDataSeriesInput

}
input InsightDataSeriesInput {
    """
    Unique ID for the series. Omit this field if it's a new series.
    """
    seriesId: String
    """
    The query string.
    """
    query: String!
    """
    The scope of repositories.
    """
    repositoryScope: RepositoryScopeInput!
    """
    The scope of time.
    """
    timeScope: TimeScopeInput!

}

If we were to do something like this it would be possible to combine chart types in once visualization (if they are compatable) as well as create multiple insights with re-using the same series but displaying them in different chart types. For example you could look at current python version over time and next to it place a pie chart using the same data which would look only at the most recent data point.

Joelkw commented 2 years ago

(For iteration 18, we want to get to potential specific implementation improvements, and even begin on them – things that will make our roadmap easier/we know are coming.)

CristinaBirkel commented 2 years ago

Google Docs doesn't have code blocks so I'm going to link to this comment in the RFC for this issue. (Pending.)

input PieChartSeriesInput {
    seriesId: String!
    otherThreshold: Float!
}
input LineChartSeriesInput {
    seriesId: String!
    label: String!
    stroke: String!
}
# .. more chart input types

input SearchSeriesInput {
    seriesId: String!
    query: String!
    repositoryScope: RepositoryScopeInput!
    timeScope: TimeScopeInput!
}
input WebhookSeriesInput {
    seriesId: String!
    url: String!
    timeScope: TimeScopeInput!
}
# .. more data generation input types

input SeriesPresentationInput {
    lineChart: [LineChartSeriesInput!]
    pieChart: [PieChartSeriesInput!]
    # .. more chart types
}
input SeriesDefinitionInput {
    search: [SearchSeriesInput!]
    webhook: [WebhookSeriesInput!]
    # .. more data generation types
}

input InsightViewInput {
    seriesOrder: [String!]!
    seriesPresentations: SeriesPresentationInput!
    seriesDefinitions: SeriesDefinitionInput!
    viewControls: InsightViewControlsInput!
    title: String!
}

Note on series ordering: If the series data/viz information is decoupled and batched like this, we need another way to specify ordering. (And in the case of an update, to specify added/removed series.) We had an id format for dashboards which allowed us to identify virtualized dashboards. I think we could do something similar here for new ids. That way new series could be denoted and ordered and still matched up with the series data/viz information in the other fields.

CristinaBirkel commented 2 years ago

Next steps:

Everyone is taking some time to think about proposed solutions; it's not a decision that we want to rush and are not going to force it this week.
New chart types and compute insights are coming up soon on the roadmap, which this work was meant to support. Instead, we'll push forward with that work with one-off solutions that will help inform how we want to evolve the API, and continue this discussion in parallel.

sourcegraph / sourcegraph-public-snapshot

insights: graphql API has too much complexity between chart types and series types #33933