hub4j / github-api

Java API for GitHub
https://github-api.kohsuke.org/
MIT License

GitHub v4 GraphQL API support #521

Open bitwiseman opened 5 years ago

bitwiseman commented 5 years ago

The v4 API provides a much more customizable API based on GraphQL. It is out of scope for this library to provide a generalized API that fully leverages the power of GraphQL, but having some way to construct queries for trees of items would be useful.

https://developer.github.com/v4/

bitwiseman commented 4 years ago

https://github.com/hub4j/github-api/pull/791#issuecomment-619786105

To connect to the GH v4 API I'm using the Apollo framework (https://github.com/apollographql/apollo-android).

The calls themselves are rather simple. The only part that is common to GitHub, and for which I had to write a small wrapper over the framework, was paging and rate limits.
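The paging wrapper mentioned here is not shown in the thread. As a rough sketch of what such a helper might look like in Java: GitHub's GraphQL connections expose `pageInfo { hasNextPage endCursor }`, and the loop below keeps requesting pages until `hasNextPage` is false. `CursorPager`, `Page`, and `fetchAll` are hypothetical names invented for this illustration, and the actual network call is left to the caller-supplied function. Rate-limit handling is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper (not part of any library): walks a cursor-paginated connection. */
final class CursorPager {

    /** One page of results plus the values of the connection's pageInfo. */
    static final class Page<T> {
        final List<T> nodes;
        final boolean hasNextPage;
        final String endCursor; // may be null on the last page

        Page(List<T> nodes, boolean hasNextPage, String endCursor) {
            this.nodes = nodes;
            this.hasNextPage = hasNextPage;
            this.endCursor = endCursor;
        }
    }

    /**
     * Calls fetchPage with a null cursor first, then with each returned
     * endCursor, until the server reports that there are no further pages.
     */
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String cursor = null;
        Page<T> page;
        do {
            page = fetchPage.apply(cursor);
            all.addAll(page.nodes);
            cursor = page.endCursor;
        } while (page.hasNextPage);
        return all;
    }
}
```

The function passed to `fetchAll` would send the query with `after: $cursor` set to the given value and map the response into a `Page`.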

For now I'm just using GraphQL to:

  • Set up branch protections (v4 allows setting branch protection with rules rather than branch by branch, so in my case for some repos I was able to replace 500 rules with 5)
  • Get a list of all users with SAML emails, to check against our AD and remove the ones that are no longer there.

bitwiseman commented 2 years ago

A side note, as mentioned in #1319: the objects returned as part of events are NOT the same as the GraphQL versions of those same objects (Discussions, in that issue). See https://docs.github.com/en/graphql/reference/objects#discussion .

GraphQL nodes have id as ID!, which is a Base64 string (https://docs.github.com/en/graphql/reference/scalars#id). The objects returned with events and by the REST API have id as a long. GraphQL has a databaseId field on many objects that has the same value as the REST API id. The REST API has a node_id field, which equals the GraphQL id value. 😠

However, you can't mix and match - REST API requires numeric ids to reference objects, while GraphQL requires ID! ids.

😡 😡

This means that when we start implementing and moving to GraphQL there will need to be some way to convert objects returned with events to their GraphQL equivalents. On the plus side, GraphQL supports renaming of returned fields, so we could force GraphQL queries to return their ID! field with the name node_id and also return databaseId.
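To illustrate the field-renaming idea, here is a minimal sketch of a query that uses a GraphQL alias (`node_id: id`) so the response carries the ID! value under the REST-style name, alongside `databaseId`. The class and constant names are hypothetical and only serve this example; the query uses variables for the owner and repository name.

```java
/** Hypothetical holder for an illustrative query; not part of this library. */
final class NodeIdQuery {

    /**
     * Asks for a repository's GraphQL id under the alias "node_id"
     * (matching the REST field name) plus its numeric databaseId.
     */
    static final String QUERY =
            "query($owner: String!, $name: String!) {\n"
          + "  repository(owner: $owner, name: $name) {\n"
          + "    node_id: id\n"
          + "    databaseId\n"
          + "  }\n"
          + "}";
}
```

With this alias the response object would have `node_id` and `databaseId` keys, which line up with the REST API's `node_id` and `id` respectively.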

bitwiseman commented 2 years ago

Breaking changes do occur in GraphQL API, but they are rare: https://docs.github.com/en/graphql/overview/breaking-changes

Non-breaking changes happen much more often: https://docs.github.com/en/graphql/overview/changelog

jgangemi commented 2 years ago

has there been any further thought about this? i have a feeling that access to dependabot alerts is only going to be available via graphql.

i guess there's nothing that would currently prevent me from using that endpoint as well...

bitwiseman commented 2 years ago

Yeah, there's a bunch of underlying work to be done here. But this might be a case where we start with something closer to the raw API and then make it more Java friendly over time.
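For a sense of what "closer to the raw API" could mean, the sketch below builds (but does not send) a plain HTTP request to the GraphQL endpoint, https://api.github.com/graphql, using the JDK's `java.net.http` classes (Java 11+). `RawGraphQlRequest` and its methods are hypothetical names, and the hand-rolled JSON escaping is a simplification; a real client would likely use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical sketch of a raw GraphQL request builder; not this library's API. */
final class RawGraphQlRequest {
    static final URI ENDPOINT = URI.create("https://api.github.com/graphql");

    /** Escapes a string for use inside a JSON string literal (minimal set). */
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps the query text in the {"query": "..."} JSON body. */
    static String body(String query) {
        return "{\"query\":\"" + jsonEscape(query) + "\"}";
    }

    /** Builds a POST request carrying the token and the query; sending is up to the caller. */
    static HttpRequest build(String token, String query) {
        return HttpRequest.newBuilder(ENDPOINT)
                .header("Authorization", "Bearer " + token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(query)))
                .build();
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, leaving response parsing, retries, and rate limits as the "underlying work" mentioned above.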

What do you have in mind?

jglick commented 2 years ago

IIUC a key advantage of GraphQL is that you can generate type-safe client interfaces from schema. That sounds like a different library altogether, though there might be a bit of overlap (or at least a factored-out common library) to deal with authentication.

bitwiseman commented 2 years ago

Just generating the schema is not enough.

You also need authentication, rate limiting, retrying, and a stable API that compensates for bugs and maintains backward compatibility.

Also, GraphQL offers more data but requires the client to specify the fields they want. This means everyone can generate queries for exactly the data they need, but if they are not sure what data is needed at query time (as with Jenkins, with its plugins), they have to request all possible data, which defeats one of the key features of GraphQL.
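To make "specify the fields they want" concrete, here is a small sketch of a nested field selection rendered into query text. `Selection` is a hypothetical class written for this illustration; it only concatenates names and does no validation against the schema.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tree of requested fields; renders to GraphQL selection syntax. */
final class Selection {
    private final String name;
    private final List<Selection> children = new ArrayList<>();

    private Selection(String name) {
        this.name = name;
    }

    /** Creates a field with zero or more nested sub-selections. */
    static Selection field(String name, Selection... children) {
        Selection s = new Selection(name);
        for (Selection c : children) {
            s.children.add(c);
        }
        return s;
    }

    /** Renders "name" for a leaf, or "name { child child ... }" for a parent. */
    String render() {
        if (children.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append(" {");
        for (Selection c : children) {
            sb.append(' ').append(c.render());
        }
        return sb.append(" }").toString();
    }
}
```

For example, `Selection.field("query", Selection.field("viewer", Selection.field("login"))).render()` yields `query { viewer { login } }`. A caller that does not know which fields downstream code will read would have to add every field to the tree up front, which is the problem described above.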

Shrug.

jglick commented 2 years ago

> if they are not sure of what data is needed at query time (as with Jenkins, with its plugins)

Perhaps; depends on what code is constructing the query and for what purpose. If it is exposing some sort of GH object for other unknown plugins to inspect in arbitrary ways, then yes this would defeat a key aspect of GraphQL.

gsmet commented 2 years ago

FWIW, in Quarkus GitHub App, I expose both the GitHub REST API using this very API and a low-level GraphQL client. From my experience, both are useful for different things and they both have unique features. For GraphQL, I can't imagine not writing my queries myself, as they really need to be tuned and limited to the strict minimum to avoid raising costs (the GraphQL API rate limiting is cost-based).
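As an illustration of the cost-based limit, GitHub's GraphQL schema has a `rateLimit` field whose `cost`, `remaining`, and `resetAt` values can be requested alongside any query. The sketch below appends that selection and estimates how many more times a query of the same cost could run. `CostBudget` and its methods are hypothetical helpers, not part of Quarkus GitHub App or this library.

```java
/** Hypothetical helper for reasoning about GraphQL query cost. */
final class CostBudget {
    static final String RATE_LIMIT_FIELDS = "rateLimit { cost remaining resetAt }";

    /** Wraps a selection set in a query that also reports its own cost. */
    static String withRateLimit(String selectionSet) {
        return "query { " + selectionSet + " " + RATE_LIMIT_FIELDS + " }";
    }

    /**
     * Estimates how many more times a query with the given cost can run
     * before the remaining points are used up. Costs below 1 are treated as 1.
     */
    static int callsLeft(int remaining, int cost) {
        return remaining / Math.max(cost, 1);
    }
}
```

Trimming fields and lowering `first:` page sizes reduces `cost`, which is why hand-tuned queries go further under this kind of limit.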

samrocketman commented 1 year ago

Purpose

The purpose of using GraphQL is resolving the same information required by plugins but using GraphQL instead of the GitHub search API (which is inefficient).

You don't need a general purpose "plugins request whatever they want" kind of client. You simply need to provide the same metadata currently provided for plugins but using GraphQL instead of the GitHub search API.

Jervis has its own client

Jervis has its own client specifically for GraphQL because this library was not up to the task. I would recommend either using it or relying on something similar in concept.

Example

Query all branches and tags for jenkinsci/jenkins and resolve latest contributor metadata for each.

https://github.com/samrocketman/jervis/issues/133#issuecomment-1614036278

In the code example linked above, you can change the following to query the jenkinsci/jenkins repo instead.

-String githubOwner = 'samrocketman'
-String githubRepo = 'jervis'
+String githubOwner = 'jenkinsci'
+String githubRepo = 'jenkins'

This resolves the jenkinsci/jenkins Git metadata in under 10 seconds (the GH api library resolves the same information in a couple of hours using over 4000 API requests).
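The Jervis code itself lives at the link above and is not reproduced here. Purely as an illustration of why one GraphQL query can replace many REST calls, the sketch below shows a query that returns up to 100 branches per request together with each head commit's author and date. This is not the Jervis query; the class name is hypothetical, and tags (under `refs/tags/`) would need a similar selection.

```java
/** Hypothetical holder for an illustrative branch-listing query. */
final class BranchHeadsQuery {

    /**
     * Lists branches with the author and date of each head commit.
     * $cursor is null for the first page, then the previous endCursor.
     */
    static final String QUERY =
            "query($owner: String!, $name: String!, $cursor: String) {\n"
          + "  repository(owner: $owner, name: $name) {\n"
          + "    refs(refPrefix: \"refs/heads/\", first: 100, after: $cursor) {\n"
          + "      pageInfo { hasNextPage endCursor }\n"
          + "      nodes {\n"
          + "        name\n"
          + "        target {\n"
          + "          ... on Commit {\n"
          + "            committedDate\n"
          + "            author { name email }\n"
          + "          }\n"
          + "        }\n"
          + "      }\n"
          + "    }\n"
          + "  }\n"
          + "}";
}
```

Each page covers 100 refs and their commit metadata in a single request, whereas a REST client typically needs at least one extra call per ref to resolve the same details.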