Quantumplation commented 2 years ago

Currently the APIs only support fetching the first page.

I plan on implementing this on my fork, and am happy to submit a PR.

My planned API, let me know if you'd accept a PR with the interface below:

Each (paginated) API call returns PaginatedResult<T, Error>
PaginatedResult<T, Error> implements IntoIterator, so you can loop over everything with the default page size
PaginatedResult<T, Error> also provides an .items(count, start, order) method that returns the same item iterator with a different page size, starting page, or order, for finer-grained control
PaginatedResult<T, Error> also provides a .pages(count, start, order) method that returns an iterator yielding Result<Vec<T>, Error>, one page at a time

Example usage:

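// Loop over every item with the default page size
for item in api.accounts_history(addr).into_iter() {
  println!("{:#?}", item);
}

// Same, but with 150 items per page, starting at page 3, in descending order
for item in api.accounts_history(addr).items(150, 3, "desc") {
  println!("{:#?}", item);
}

// Page by page, stopping once a request budget is used up
for page in api.accounts_history(addr).pages(150, 3, "desc") {
  // Each page is a Result<Vec<T>, Error>
  for item in page? {
    println!("{:#?}", item);
  }
  api_limit -= 1;
  if api_limit <= 0 {
    break;
  }
}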
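To make the proposal above concrete, here is a minimal synchronous sketch of how .pages() and .items() could relate to each other. Everything in it is an assumption for illustration: the data is an in-memory mock instead of HTTP calls, T is fixed to u32, Error is a String, and order is an enum rather than the "desc" string used above.

```rust
// Hypothetical sketch of the proposed interface; none of these types exist in the crate.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Order {
    Asc,
    Desc,
}

/// Stand-in for the result of one paginated endpoint, backed by mock data.
struct PaginatedResult {
    data: Vec<u32>,
}

/// Iterator that yields one page at a time.
struct Pages {
    data: Vec<u32>,
    count: usize,
    page: usize, // 1-based page number
}

impl Iterator for Pages {
    type Item = Result<Vec<u32>, String>;

    fn next(&mut self) -> Option<Self::Item> {
        let start = (self.page - 1) * self.count;
        if start >= self.data.len() {
            return None; // past the last page
        }
        let end = (start + self.count).min(self.data.len());
        self.page += 1;
        Some(Ok(self.data[start..end].to_vec()))
    }
}

impl PaginatedResult {
    /// Page-by-page iteration with explicit settings.
    fn pages(&self, count: usize, start: usize, order: Order) -> Pages {
        let mut data = self.data.clone();
        if order == Order::Desc {
            data.reverse();
        }
        // Clamp to 1 so the page arithmetic above cannot underflow or loop forever.
        Pages { data, count: count.max(1), page: start.max(1) }
    }

    /// Item-by-item iteration, built by flattening the pages.
    fn items(&self, count: usize, start: usize, order: Order) -> impl Iterator<Item = Result<u32, String>> {
        self.pages(count, start, order).flat_map(|page| {
            let flat: Vec<Result<u32, String>> = match page {
                Ok(items) => items.into_iter().map(Ok).collect(),
                Err(e) => vec![Err(e)],
            };
            flat
        })
    }
}

fn main() {
    let history = PaginatedResult { data: (1..=7).collect() };
    for page in history.pages(3, 1, Order::Asc) {
        println!("{:?}", page);
    }
    for item in history.items(3, 2, Order::Desc) {
        println!("{:?}", item);
    }
}
```

Building items() on top of pages() keeps the paging logic in one place, which is one possible way to satisfy both the "just get everything" and the "work page by page" use cases.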

marcospb19 commented 2 years ago

Supporting pagination is a must. Here are some comments on the proposed interface and on the issue itself.

Query parameters

I plan on adding query parameter settings to the Settings struct that goes with BlockFrostApi, including count, start, order, from and to. You could then use the builder pattern to customize them, with everything set to None by default.

If this gets implemented, we would be able to write something like the following (this is a rough sketch of what I'm planning):

let api = create_api();
let lister = api.clone().configure(|query| query.set_order(Descending).set_count(50));

let mut _iter = lister.account_history_all().await;

That would take care of the finer control. Feedback on this alternative solution would be appreciated!
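A minimal sketch of how such a &mut self builder and a configure method could fit together is shown below. The names QuerySettings, Api, and to_query_string are assumptions made up for this example, the from and to parameters are left out for brevity, and the real crate may structure this differently.

```rust
// Hypothetical sketch: the type and method names here are illustrative, not the crate's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Order {
    Ascending,
    Descending,
}

/// Query parameters, all unset (None) by default.
#[derive(Debug, Clone, Default, PartialEq)]
struct QuerySettings {
    count: Option<u8>,
    page: Option<u64>,
    order: Option<Order>,
}

impl QuerySettings {
    // Each setter returns &mut Self so calls can be chained.
    fn set_count(&mut self, count: u8) -> &mut Self {
        self.count = Some(count);
        self
    }

    fn set_page(&mut self, page: u64) -> &mut Self {
        self.page = Some(page);
        self
    }

    fn set_order(&mut self, order: Order) -> &mut Self {
        self.order = Some(order);
        self
    }

    /// Render only the parameters that were explicitly set.
    fn to_query_string(&self) -> String {
        let mut parts = Vec::new();
        if let Some(count) = self.count {
            parts.push(format!("count={}", count));
        }
        if let Some(page) = self.page {
            parts.push(format!("page={}", page));
        }
        if let Some(order) = self.order {
            let value = match order {
                Order::Ascending => "asc",
                Order::Descending => "desc",
            };
            parts.push(format!("order={}", value));
        }
        parts.join("&")
    }
}

#[derive(Debug, Clone, Default)]
struct Api {
    query: QuerySettings,
}

impl Api {
    /// Takes the API by value, applies the closure to its settings, and hands it back.
    fn configure(mut self, f: impl FnOnce(&mut QuerySettings) -> &mut QuerySettings) -> Self {
        f(&mut self.query);
        self
    }
}

fn main() {
    let api = Api::default();
    let lister = api
        .clone()
        .configure(|query| query.set_order(Order::Descending).set_count(50).set_page(2));
    println!("{}", lister.query.to_query_string());
    println!("{}", Api::default().configure(|q| q.set_order(Order::Ascending)).query.to_query_string());
}
```

Because configure consumes a clone, the original api keeps its default settings, which matches the idea of cloning the API into a dedicated lister.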

PaginatedResult

If we use the JavaScript implementation as a reference, methods would have an alternate version, suffixed with _all, that supports pagination.

let item = api.accounts_history(addr).await?;
let iter = api.accounts_history_all(addr).await?;

I would prefer this approach instead, because (if I understood correctly) PaginatedResult<T, Error> needs to be .awaited and could have already returned one error before starting the iteration. If not .awaited then the methods would need to be implemented over an impl Future<Output = PaginatedResult<T, Error>>.

I'm still undecided on whether this should be implemented with a concurrent or a parallel solution.

Rate limits and retrying

Another thing we should take into account: this library will support retrying with a custom delay when the rate limits are reached. The concurrent (or parallel) solution must therefore wait on the next element even if later ones are already ready, so we can guarantee to the user that the results are all delivered in the correct order. This will also require some extra work.
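The retry part of this can be sketched independently of the ordering concern. The example below is a blocking, std-only illustration; ApiError, retry_on_rate_limit, and its parameters are hypothetical names, and a real async implementation would use an async sleep instead of std::thread::sleep.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical error type; the crate's real error enum may look different.
#[derive(Debug, PartialEq)]
enum ApiError {
    RateLimited,
    NotFound,
}

/// Call `request` until it stops reporting a rate limit, waiting `delay`
/// between attempts, for at most `max_retries` retries.
fn retry_on_rate_limit<T>(
    max_retries: u32,
    delay: Duration,
    mut request: impl FnMut() -> Result<T, ApiError>,
) -> Result<T, ApiError> {
    let mut attempt = 0;
    loop {
        match request() {
            Err(ApiError::RateLimited) if attempt < max_retries => {
                attempt += 1;
                sleep(delay);
            }
            // Success, a different error, or retries exhausted: return as is.
            other => return other,
        }
    }
}

fn main() {
    // Mock request that is rate limited twice before succeeding.
    let mut calls = 0;
    let result = retry_on_rate_limit(3, Duration::from_millis(10), || {
        calls += 1;
        if calls < 3 {
            Err(ApiError::RateLimited)
        } else {
            Ok("page 1")
        }
    });
    println!("{:?} after {} calls", result, calls);

    // Errors other than a rate limit are returned immediately.
    let failed: Result<&str, ApiError> =
        retry_on_rate_limit(3, Duration::from_millis(10), || Err(ApiError::NotFound));
    println!("{:?}", failed);
}
```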

Quantumplation commented 2 years ago

If you have opinions on the interface, happy to conform to that. In the interest of exploring that further, I think the main criteria I would have are ensuring that:

It's easy to just get everything if you don't care
It's easy to work page by page if you need to

So in the above, how would the user go about fetching the second page on the non-all version?

let mut page = api.accounts_history(addr).await?;
loop {
  for item in page {
    // ...
  }
  // Does the `api` object keep track of the current page per endpoint?
  page = api.accounts_history(addr).await?;
  // do I have to call api.set_start, and then reset it for a different API call?
  api.set_start(page.len());
}

Additionally, are we assuming the pagination settings apply to every API call? I think that seems reasonable, except for the start bit.

What I was trying to capture with the paginated result is this: calling the API returns an iterator-like object that stores your current settings and state and can be used to advance through the pages. If I were ambitious, I'd just implement the Rust Stream trait, but I'm not sure whether that's stable yet.

As for the api and api_all versions, it should be pretty simple to make a macro or two to reduce some of the boilerplate.
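As a rough idea of what such a macro could look like, here is a std-only sketch that generates an _all method for each single-page method. The Api struct, mock_page, and impl_all_variant are hypothetical names; the endpoints are synchronous mocks returning Vec<u32>, and both method names are passed explicitly because macro_rules cannot concatenate identifiers on its own.

```rust
// Hypothetical mock API; the real endpoints are async and return typed results.
struct Api {
    total_items: u32,
    page_size: u32,
}

impl Api {
    /// Mock single-page endpoints (pages are 1-based).
    fn accounts_history(&self, page: u32) -> Vec<u32> {
        self.mock_page(page)
    }

    fn blocks_previous(&self, page: u32) -> Vec<u32> {
        self.mock_page(page)
    }

    /// Returns the items on `page`, or an empty Vec past the end.
    fn mock_page(&self, page: u32) -> Vec<u32> {
        let start = (page - 1) * self.page_size;
        let end = (start + self.page_size).min(self.total_items);
        (start..end).collect()
    }
}

/// For each `single => all` pair, generate a method `all` that keeps
/// requesting pages from `single` until an empty page comes back.
macro_rules! impl_all_variant {
    ($($single:ident => $all:ident),* $(,)?) => {
        impl Api {
            $(
                fn $all(&self) -> Vec<u32> {
                    let mut items = Vec::new();
                    for page in 1.. {
                        let batch = self.$single(page);
                        if batch.is_empty() {
                            break;
                        }
                        items.extend(batch);
                    }
                    items
                }
            )*
        }
    };
}

impl_all_variant!(
    accounts_history => accounts_history_all,
    blocks_previous => blocks_previous_all,
);

fn main() {
    let api = Api { total_items: 7, page_size: 3 };
    println!("{:?}", api.accounts_history(2));
    println!("{:?}", api.accounts_history_all());
    println!("{:?}", api.blocks_previous_all());
}
```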

marcospb19 commented 2 years ago

I'll opt for implementing the Stream trait, this one specifically:

https://docs.rs/futures/0.3.17/futures/prelude/trait.Stream.html

Seems like the best option as std::stream::Stream is not stabilized.

Because a for loop cannot drive an async stream, while let has to be used instead:

// `.next()` comes from the StreamExt extension trait
use futures::StreamExt;

let mut account_history_lister = api.accounts_history_all(addr);

while let Some(page) = account_history_lister.next().await {
    dbg!(page);
}

For the concurrency part, I'll be using https://docs.rs/futures/0.3.17/futures/stream/struct.FuturesOrdered.html.

marcospb19 commented 2 years ago

I forgot to answer previously:

// Does the api object keep track of the current page per endpoint?

Yes. If you're using different endpoints, take care not to carry the page counter over from one to another.

However, the suggested approach in this case is to clone the api into a lister.

let api = create_api();

let account_history_lister = api.clone();

You may want to configure it too. There are two ways to do so with the &mut self builder pattern.

// First option
let mut account_history_lister = api.clone();
account_history_lister.set_page(910);

// Second option
let account_history_lister = api.clone().configure(|api| api.set_page(910));

Here's the source code for configure.

// do I have to call api.set_start, and then reset it for a different API call?

I will add a convenience method, pass_page() (or increment_page()), to the api, so you can call it at the bottom of the loop.

let mut account_history_lister = api.accounts_history_all(addr);

while let Some(page) = account_history_lister.next().await {
    dbg!(page);
    api.increment_page();
}

This can be useful if you want the api to always point at the next page, instead of storing the page somewhere else.

I opted not to do this automatically, to avoid confusion and spaghetti code (and trouble with the borrow checker).

NOTE: this increments the page field of api, but not of account_history_lister; the latter already contains an automatically incremented counter for the listing.

marcospb19 commented 2 years ago

This was hard, but it's done!

The function blocks_previous_all now returns a Lister<Vec<Block>> that implements Stream and always has 10 concurrent jobs running.
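The idea of a fixed number of concurrent jobs whose results still come out in page order can be illustrated without async at all. The sketch below is not the crate's Lister: it uses OS threads and a VecDeque as a stand-in for FuturesOrdered, a mock fetch_page function, and a known last_page, all of which are assumptions made for this example.

```rust
use std::collections::VecDeque;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Mock page fetch. Later pages finish sooner on purpose, to show that
/// the output order does not depend on completion order.
fn fetch_page(page: u64) -> Vec<u64> {
    thread::sleep(Duration::from_millis(20u64.saturating_sub(page)));
    vec![page * 10, page * 10 + 1]
}

/// Hypothetical lister that keeps up to `window` fetches in flight.
struct Lister {
    next_page: u64,
    last_page: u64,
    window: usize,
    in_flight: VecDeque<JoinHandle<Vec<u64>>>,
}

impl Lister {
    fn new(first_page: u64, last_page: u64, window: usize) -> Self {
        let mut lister = Lister {
            next_page: first_page,
            last_page,
            window,
            in_flight: VecDeque::new(),
        };
        lister.fill();
        lister
    }

    /// Start new fetches until the window is full or no pages remain.
    fn fill(&mut self) {
        while self.in_flight.len() < self.window && self.next_page <= self.last_page {
            let page = self.next_page;
            self.in_flight.push_back(thread::spawn(move || fetch_page(page)));
            self.next_page += 1;
        }
    }
}

impl Iterator for Lister {
    type Item = Vec<u64>;

    fn next(&mut self) -> Option<Self::Item> {
        // Always wait on the oldest job, even if newer ones are already done.
        let oldest = self.in_flight.pop_front()?;
        let page = oldest.join().expect("fetch thread panicked");
        self.fill();
        Some(page)
    }
}

fn main() {
    for page in Lister::new(1, 5, 3) {
        println!("{:?}", page);
    }
}
```

Waiting on the front of the queue is what keeps results in order; the cost is that one slow page can hold back pages that have already finished.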

Now we just need a macro for implementing it for the remaining page requests.

Example usage at https://github.com/blockfrost/blockfrost-rust/blob/master/examples/lister.rs, thanks for your tips on the interface!

blockfrost / blockfrost-rust

Support Pagination #2
