google / go-github

Go library for accessing the GitHub v3 API
https://pkg.go.dev/github.com/google/go-github/v63/github
BSD 3-Clause "New" or "Revised" License
10.29k stars 2.04k forks source link

Paginated client.Issues.List #22

Closed yml closed 11 years ago

yml commented 11 years ago

The github api documentation shows that the /issues api endpoints are paginated.

The pagination is information is returned in the header like this:

Link:[<https://api.github.com/issues?direc
tion=desc&filter=all&labels=&sort=updated&state=open&page=2>; rel="next",

Unfortunately as far as I can tell the go-github package do not expose this information. listIssues does not return the Headers nor the response. What is in your opinion the best approach to fix this issue ?

To put it in a nutshell I can't think of an easy way to make a backward compatible changes to fix this bug.

willnorris commented 11 years ago

yeah, there's definitely not a great way to do this. In my original design for the API, I had all of the methods returning a three-part response, which included the http.Response. So something like:

func (s *IssuesService) List(all bool, opt *IssueListOptions) ([]Issue, http.Response, error)

I seem to recall that I changed it at @nf's suggestion, but now I'm not finding that email thread, so maybe I did it myself. Since GitHub uses index paging (rather than pagination tokens), you can actually construct the paginated requests yourself. Though that does potentially result in an extra request when you reach the end of the set.

Another approach could be to add an additional flag to the Options struct which causes the library to handle all the pagination for you, and just return the final complete set of results. In that case, we'd probably want to change the response to be a channel, rather than blocking on however many requests it would end up taking.

Since the library is still has a very limited set of users (as far as I know), I'm not too concerned about making breaking changes.

willnorris commented 11 years ago

/cc @sqs @wlynch92 @ktoso in case they have thoughts.

yml commented 11 years ago

Thanks for your quick reply I think the approach that I like the most is the one from your original design API where you return the http.Response. This let anyone built on go-github.

I am not sure I like the approach where you return a complete set of results because this might be long and inefficient if you just want to know how many pages you have or if you have a next page. In my use case I would like to retrieve the first 50 results and to know if there is more.

Thinking a bit more about it I can see a mix of the 2 approaches working well:

func (s *IssuesService) List(all bool, opt *IssueListOptions) ([]Issue, http.Response, error)

With this in hand I thing I can do pretty much anything I can think of.

wlynch commented 11 years ago

I agree with @yml that returning all of the results could be costly for certain projects, even if it is a channel.

However, exposing the http.Response in functions outside of the client's http-like functions (NewRequest, Do, etc.) seems inconsistent with the rest of the API since we never take in or return a http struct anywhere else in the library. Instead of the http.Response do we just want to return a struct that contains the parsed pagination information? For example:

type Page struct {
URL string;
Next int;
Prev int;
First int;
Last int;
}

And instead have: func (s *IssuesService) List(all bool, opt *IssueListOptions) ([]Issue, *Page, error), returning *Page=nil if there are no pages to display. With this we could even do things like Page.Next() since the HTTP query parameters seemed to be preserved in the response header.

yml commented 11 years ago

@wlynch92 there is few other headers that can be very useful to understand what is going on for example let you know where you are with regards to the ratelimiting X-Ratelimit-Remaining:[4999].

Pagination is not only for Issues but for many ressources. I have the feeling that even if returning Page struct fix my immediate needs that soon you will need to add a RateLimiting, Etag, or an AccessControlAllowOrigin and who knows what next.

At least returning response.Header back on all the api calls would be really handy and since we are already breaking the api why not just return the full response to be sure that no one miss something in the future.

willnorris commented 11 years ago

@yml the rate limit headers are already used to update the client.Rate on each request, and I had hoped to have a separate http client cache that would transparently handle Etag and such, but your point is well taken. I agree that simply returning the full http.Response will be the most flexible. Actually, what I think I had done before was to return a custom Response struct that wrapped the http.Response, and I then added some extra convenience methods on that. How does that sound?

yml commented 11 years ago

@willnorris this sounds fantastic and some syntactic sugar to Parse the header LInk will be more than welcome. thanks

willnorris commented 11 years ago

So, are the full header Link urls the right thing to return? What would you do with them? You could then construct your own http.Request, make the API call to GitHub, manually parse the response, etc. But as it's currently designed, go-github is not really a RESTful client, so it doesn't make it easy for you to just provide it a URL and have it do the right thing.

I think what may be more useful is to actually parse the Link header values and return the value of the page parameter, so that you can then update your ListOptions object, and submit the next request the normal way. So maybe something like:

type Response struct {
    http.Response
}

func (r *Response) NextPage() int
func (r *Response) PrevPage() int
func (r *Response) FirstPage() int
func (r *Response) LastPage() int

If you really want the full URL, you can call the normal Headers() method.

willnorris commented 11 years ago

or maybe those should really be strings rather than ints, just in case GitHub ever deviates from straight index-based paging?

yml commented 11 years ago

What you are describing above with the type Response struct this similar to what I was imagining. So the method signature will look like something like this :

func (s *IssuesService) List(all bool, opt *IssueListOptions) ([]Issue, Response, error)

or maybe those should really be strings rather than ints, just in case GitHub ever deviates from straight index-based paging?

This would be silly and it seems to me that this would be time for them to call it V4 and then go-github could update its API to follow.

yml commented 11 years ago

I forget to mention earlier because it was obvious in my mind but IssueListOption will also be extended with a new Page int field.

This field will be used to get the issues that are on the page X.