aws / aws-sdk-ruby

The official AWS SDK for Ruby.
https://aws.amazon.com/sdk-for-ruby/
Apache License 2.0
3.56k stars 1.22k forks source link

CloudSearchDomain.search not populating pager #984

Closed onyxraven closed 8 years ago

onyxraven commented 8 years ago

It looks like either the CloudSearchDomain search endpoint should be pagable and is not filling in the Pager object, or it shouldn't, and it should not mix in PageableResponse.

Example:

@search_client ||= Aws::CloudSearchDomain::Client.new
out = @search_client.search(query: 'test', query_parser: 'simple')
p out.class # => Seahorse::Client::Response < Object
p out.pager # => nil
p out.kind_of?(Aws::PageableResponse) # => true
out.each { |p| }
# => NoMethodError: undefined method `truncated?' for nil:NilClass
# => from .../aws-sdk-core-2.1.33/lib/aws-sdk-core/pageable_response.rb:48:in `last_page?'

From the changelog I expected #each to throw immediately in this case

However, I expected CloudSearchDomain.search to be pageable, since it has a cursor to get to the next page of results http://docs.aws.amazon.com/cloudsearch/latest/developerguide/search-api.html - http://docs.aws.amazon.com/cloudsearch/latest/developerguide/paginating-results.html

awood45 commented 8 years ago

It does look like the pagination definition document is missing, that's where you're running in to problems. I'll look in to this.

awood45 commented 8 years ago

This is going to take some extended investigation on our end, since the API response doesn't give us all of the values we need to write a complete pagination definition easily.

In the meantime, you can check pagination manually. Consider the following (against the IMDB sample data):

Aws> resp = client.search(query: "star", query_parser: 'simple', return: "title", cursor: "initial")
[Aws::CloudSearchDomain::Client 200 0.367792 0 retries] search(query:"star",query_parser:"simple",return:"title",cursor:"initial")
=> #<struct Aws::CloudSearchDomain::Types::SearchResponse
 status=#<struct Aws::CloudSearchDomain::Types::SearchStatus timems=5, rid="requestid=">,
 hits=
  #<struct Aws::CloudSearchDomain::Types::Hits
   found=85,
   start=0,
   cursor="cursorresponse",
   hit=
    [#<struct Aws::CloudSearchDomain::Types::Hit id="tt1411664", fields={"title"=>["Bucky Larson: Born to Be a Star"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt1911658", fields={"title"=>["The Penguins of Madagascar"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0086190", fields={"title"=>["Star Wars: Episode VI - Return of the Jedi"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0056687", fields={"title"=>["What Ever Happened to Baby Jane?"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0120601", fields={"title"=>["Being John Malkovich"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt1674771", fields={"title"=>["Entourage"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt2141761", fields={"title"=>["The Blackout"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0258153", fields={"title"=>["S1m0ne"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0397892", fields={"title"=>["Bolt"]}, exprs=nil, highlights=nil>,
     #<struct Aws::CloudSearchDomain::Types::Hit id="tt0069945", fields={"title"=>["Dark Star"]}, exprs=nil, highlights=nil>]>,
 facets=nil>

If you see that the number of hits + the start index is less than the "found" part of the response, you know you have a truncated response and can continue to use the cursor to make additional queries.

Leaving this open for a bit because I'd like to build in an answer to this problem.

awood45 commented 8 years ago

From a pagination perspective, our spec doesn't (yet) have a way to handle the structure of the responses consistent with the pagination pattern.

It is possible, it's just going to be a matter of either writing a CloudSearch-specific customization, or enhancing the pagination spec.

In the meantime, you can do pagination, you just have to pass in a cursor or increment the start parameter by the number of objects in the hit array. Unfortunately, that does mean that you'd have to roll your own #each method.

Tracking this for now as a feature request.

awood45 commented 8 years ago

Added this to the feature request backlog. Please feel free to reach out if you have any questions or difficulties using the response to page through your search results.

onyxraven commented 8 years ago

Thanks. In the short term, can you ensure the api doesnt ~think~ this is pagination enabled? I ran into this by assuming it was because each() didn't immediately throw, but then in practice and especially in test/stubbed, thats when it threw the error.

awood45 commented 8 years ago

Actually, I would like to investigate that bit. The way the lack of a pagination spec is presenting isn't too helpful.

onyxraven commented 8 years ago

Maybe a consistent way to indicate it in the documentation would definitely help. Let me know if thats something I could open a PR for.

awood45 commented 8 years ago

No, I think there's a bug in here to investigate, it shouldn't be treating this like a PageableResponse with no pagination config defined.

Defining the pagination spec is a feature request, the bug is why it's presenting in this manner.

awood45 commented 8 years ago

What appears to happen is that Aws::Plugins::ResponsePaging is a default plugin, whether or not pagination is defined for a given client. This is why PageableResponse appears in the ancestor list.

awood45 commented 8 years ago

Aws::Pager meanwhile, is nil in this case, so we'll always hit this particular piece. That's the core issue.

awood45 commented 8 years ago

I have a proposed patch for this that I'm getting reviewed. Hoping to push that tomorrow to go out with the next release.

awood45 commented 8 years ago

This fix will go out with the next release, and won't make non-pageable operations appear pageable.