jimmoores / quandl4j

Java wrapper for Quandl REST API
quandl4j.org
Apache License 2.0
78 stars, 22 forks

Automatic Iteration of Paged Search Results #14

Open hkothari opened 8 years ago

hkothari commented 8 years ago

Hey Jim,

I'd like to propose building in some facilities for automatic iteration of paged results, so that callers can operate on a complete result set when responses are paged. The impetus for this is once again the search endpoint. I'd love to be able to get all of the results for a given database, but for the many databases that have thousands of datasets, the limit of 100 results per request means a lot of paging.

I'd love to build in functionality that handles this transparently and pages through the results automatically, either serially or in parallel. I imagine this as an AutoPagingSearchResult class returned by an "autoPagingSearch" method on QuandlSession. It would take an AutoPagingSearchRequest, which is similar to SearchRequest but leaves out the page number field and instead has the user specify serial or parallel and, if necessary, the number of threads for the parallel requests.
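To make the parallel mode concrete, here is a rough sketch of what it might look like. Everything in it is hypothetical and not part of quandl4j: `ParallelPager`, the `fetchPage` callback (page number in, that page's results out) and the `totalPages` argument, which I'm assuming could be read from the metadata of a first request. It only shows the thread pool part, with results merged back in page order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

/** Hypothetical helper: fetches pages 1..totalPages on a fixed-size thread pool. */
final class ParallelPager {
    private ParallelPager() { }

    /**
     * @param fetchPage  hypothetical callback mapping a 1-based page number to that page's items
     * @param totalPages assumed to be known up front (e.g. from a first response)
     * @param threads    number of worker threads, must be at least 1
     */
    static <T> List<T> fetchAll(IntFunction<List<T>> fetchPage, int totalPages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per page; the futures list preserves page order.
            List<Future<List<T>>> futures = new ArrayList<>();
            for (int page = 1; page <= totalPages; page++) {
                final int p = page;
                Callable<List<T>> task = () -> fetchPage.apply(p);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order, so the output is ordered by page.
            List<T> all = new ArrayList<>();
            for (Future<List<T>> future : futures) {
                all.addAll(future.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching pages", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A page request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch a failure on any page fails the whole call, and the complete result set is held in memory; a real implementation would probably want retries and some form of rate limiting as well.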

What do you think of this proposal?

Best, Hamel

hkothari commented 8 years ago

Hmmmm, I realize that for my specific needs the datasets_list endpoint, which returns a zip of a CSV, is probably a better fit: https://www.quandl.com/docs/api#dataset-list (shame on me for not digging into the docs more). Still, this might be cool to add.

jimmoores commented 8 years ago

I've noticed odd behaviour when you have a large number of pages. I used to use the empty query with a random page offset to pick test data for the unit tests, but that stopped working a few months back. That's probably partly because querying a page in a result set tends to force the database to realise all the rows before that page. Agreed, it could be a nice thing to add, perhaps as an iterator?
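Something along these lines is what I have in mind for the iterator, as a minimal sketch only. `PagingIterator` and the `fetchPage` callback (page number in, that page's items out) are made-up names, not existing quandl4j API, and it assumes an empty page signals the end of the results.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/** Hypothetical iterator that lazily requests one page at a time, in order. */
final class PagingIterator<T> implements Iterator<T> {
    // Hypothetical callback: maps a 1-based page number to the items on that page.
    private final IntFunction<List<T>> fetchPage;
    private Iterator<T> current = Collections.emptyIterator();
    private int nextPage = 1;
    private boolean exhausted = false;

    PagingIterator(IntFunction<List<T>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one has been fully consumed.
        while (!current.hasNext() && !exhausted) {
            List<T> page = fetchPage.apply(nextPage++);
            if (page.isEmpty()) {
                exhausted = true; // assumption: an empty page means no more results
            } else {
                current = page.iterator();
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

The caller would just loop with `hasNext()` and `next()`, and no request is made until the first call to `hasNext()`. Pages are requested one after another, so it is the serial case only.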