hyperledger / aries-askar

Secure storage designed for Hyperledger Aries agents.
Apache License 2.0

Pagination support #232

Closed ff137 closed 1 month ago

ff137 commented 3 months ago

Does aries-askar support paginated queries?

Coming from the ACA-Py world, all I can see that's implemented there is fetch by filter, or fetch all.

When working with potentially millions of wallets in a group, pagination is a critical feature to support querying all wallets over HTTP.

If it's already supported in askar, I can work to implement it in the python wrapper + ACA-Py.

TimoGlastra commented 3 months ago

I think you can use scan methods, with limit and offset to implement pagination.

This is supported in the Python wrapper I believe, but not yet integrated into ACA-Py AFAIK.

Would be a great addition though!

I think the only problem would be that you can't use cursors, so the results are prone to database changes while fetching the next/previous pages.

https://github.com/hyperledger/aries-askar/blob/main/src/ffi/store.rs#L542
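As a rough illustration of the limit/offset approach suggested above, here is a minimal sketch. It does not use the askar API: the `items` table, the `fetch_page` helper and the sample data are all hypothetical, and SQLite from the Python standard library stands in for the real store.

```python
import sqlite3


def fetch_page(conn, category, offset, limit):
    """Return one page of (name, value) rows for a category via LIMIT/OFFSET."""
    rows = conn.execute(
        "SELECT name, value FROM items WHERE category = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (category, limit, offset),
    )
    return rows.fetchall()


# Hypothetical in-memory table standing in for a wallet store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, name TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO items (category, name, value) VALUES (?, ?, ?)",
    [("wallet", f"wallet-{i}", f"data-{i}") for i in range(10)],
)

# Each request is stateless: the client only has to send offset and limit.
page_one = fetch_page(conn, "wallet", offset=0, limit=4)
page_two = fetch_page(conn, "wallet", offset=4, limit=4)
page_three = fetch_page(conn, "wallet", offset=8, limit=4)
```

The explicit `ORDER BY` matters here: without a stable ordering, consecutive pages are not guaranteed to line up even when the data does not change.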

ff137 commented 3 months ago

Nice! Thanks @TimoGlastra, that helps a lot. I'll see what we can get implemented, soon™

swcurran commented 3 months ago

@esune @loneil — something to note when thinking about pagination support in ACA-Py.

andrewwhitehead commented 3 months ago

The Scan object returned over FFI is essentially a forward-only cursor on the query results. To return pages over multiple HTTP requests, ACA-Py would need to keep the cursor in memory until it is accessed again (for up to 5 minutes, renewing on additional requests maybe). This is possible, but potentially not very reliable if the application is automatically scaling multiple container instances.
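To make the "keep the cursor in memory" idea concrete, here is a minimal sketch of a registry that holds forward-only iterators between requests and expires them after five minutes, renewing on each access. `CursorRegistry` and its methods are hypothetical names, not part of askar or ACA-Py, and a plain `range` stands in for a Scan object.

```python
import time
import uuid


class CursorRegistry:
    """Hold forward-only iterators in memory between requests, with expiry."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._cursors = {}  # token -> (iterator, expiry time)

    def open(self, iterable):
        """Register an iterable and return an opaque token for the client."""
        token = uuid.uuid4().hex
        self._cursors[token] = (iter(iterable), self._clock() + self._ttl)
        return token

    def next_page(self, token, page_size):
        """Return up to page_size items and renew the cursor's expiry."""
        self._evict_expired()
        if token not in self._cursors:
            raise KeyError("cursor expired or unknown")
        iterator, _ = self._cursors[token]
        page = []
        for item in iterator:
            page.append(item)
            if len(page) == page_size:
                break
        if len(page) < page_size:
            del self._cursors[token]  # iterator is exhausted
        else:
            self._cursors[token] = (iterator, self._clock() + self._ttl)
        return page

    def _evict_expired(self):
        now = self._clock()
        for token in [t for t, (_, exp) in self._cursors.items() if exp <= now]:
            del self._cursors[token]


# A fake clock makes the renewal behaviour easy to follow.
fake_now = [0.0]
registry = CursorRegistry(ttl_seconds=300.0, clock=lambda: fake_now[0])
token = registry.open(range(5))
first = registry.next_page(token, 2)
fake_now[0] = 200.0
second = registry.next_page(token, 2)  # access renews the expiry
fake_now[0] = 450.0
third = registry.next_page(token, 2)  # still valid because of the renewal
```

The registry lives in one process, which is exactly the scaling concern raised above: a follow-up request routed to a different container instance would not find the token.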

ff137 commented 3 months ago

fastapi-pagination is a popular library for implementing pagination within FastAPI apps. It may offer a helpful reference for a protocol to follow.

The limit-offset pattern will respond with the current page number and the total number of pages. That way the client can increment the offset to get the next page. An ordering mechanism may be important as well.
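A minimal sketch of what such a limit-offset response could look like is below. The `page_envelope` helper and its field names are illustrative assumptions, not the schema used by fastapi-pagination or by ACA-Py.

```python
import math


def page_envelope(items, total, offset, limit):
    """Wrap one page of items with the metadata a client needs to keep paging."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    has_more = offset + limit < total
    return {
        "items": items,
        "total": total,
        "page": offset // limit + 1,  # 1-based current page number
        "pages": math.ceil(total / limit),  # total number of pages
        "next_offset": offset + limit if has_more else None,
    }


# Hypothetical data: 25 records, requesting the second page of 10.
records = [f"wallet-{i}" for i in range(25)]
response = page_envelope(records[10:20], total=len(records), offset=10, limit=10)
```

Here `response["page"]` is 2 out of `response["pages"]` equal to 3, and the client would send `offset=20` for the next request.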

TimoGlastra commented 3 months ago

> The Scan object returned over FFI is essentially a forward-only cursor on the query results

Would it be possible to create a new scan object with a certain offset instead of keeping the cursor in-memory? Or is there significant performance impact to that?


Also, would there be something in askar that can be used as a cursor rather than an offset? I think this performs better for large datasets, and it also solves the problem of records being added or deleted, but it does need something sortable.
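The cursor idea raised here is often called keyset pagination: the client passes back the last sort key it saw instead of an offset. A minimal sketch follows, assuming a sortable integer `id` column. The table, the `fetch_after` helper and the data are hypothetical, and SQLite is used only for illustration; whether askar exposes a suitable key is the open question.

```python
import sqlite3


def fetch_after(conn, last_id, limit):
    """Return rows with id greater than last_id, plus the cursor for the next page."""
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"wallet-{i}") for i in range(1, 7)],
)

first_page, cursor = fetch_after(conn, last_id=0, limit=3)

# A record from an already-delivered page is deleted between requests.
conn.execute("DELETE FROM items WHERE id = 2")

# The next page still starts right after the last id the client saw.
second_page, cursor = fetch_after(conn, last_id=cursor, limit=3)
```

Because the query filters on the key rather than counting rows, the deletion does not shift the second page, and the database can use the index on `id` instead of skipping over earlier rows.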

andrewwhitehead commented 3 months ago

> Would it be possible to create a new scan object with a certain offset instead of keeping the cursor in-memory? Or is there significant performance impact to that?

Well, yes that's essentially what indy-sdk did, but it may produce duplicate records or miss records due to concurrent updates.

> Also, would there be something in askar that can be used as a cursor rather than an offset? I think this performs better for large datasets, and it also solves the problem of records being added or deleted, but it does need something sortable.

You could maybe put the results into a temporary table. I'm not sure if postgres offers a better option for snapshotting the results that would be accessible from a subsequent DB connection.
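The missed-record problem with re-querying at an offset can be reproduced in a few lines. This is only a sketch on a hypothetical SQLite table, not askar or indy-sdk code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"wallet-{i}") for i in range(1, 7)],
)


def offset_page(conn, offset, limit):
    """Return the ids on one page, re-running the query at the given offset."""
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?", (limit, offset)
    )
    return [row[0] for row in rows]


seen = offset_page(conn, offset=0, limit=3)

# A concurrent update removes a record the client has already received.
conn.execute("DELETE FROM items WHERE id = 2")

# The remaining rows shift left by one, so the record with id 4 is skipped.
seen += offset_page(conn, offset=3, limit=3)
missed = sorted(set(range(1, 7)) - set(seen))
```

After the deletion the second page starts at id 5, so `missed` contains id 4. An insertion ahead of the offset would shift rows the other way and return a duplicate instead.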

ff137 commented 1 month ago

Has anyone started looking at this, or been assigned to look at it?

If not, I'll see if I can make a contribution soon

esune commented 1 month ago

> Has anyone started looking at this, or been assigned to look at it?
>
> If not, I'll see if I can make a contribution soon

I don't believe we have anyone actively assigned to this task - if you have capacity to tackle it please do and thank you! 🙂

ff137 commented 1 month ago

After reviewing, there don't seem to be any changes necessary in askar itself, as pagination is already implemented in the Python bindings as well.

I've created an issue on ACA-Py to track implementation there: https://github.com/hyperledger/aries-cloudagent-python/issues/3001