get_multi & co. assume/force pagination

slaarti commented 2 months ago

Is your feature request related to a problem? Please describe.

I understand that the ability to do pagination with get_multi etc. is considered a feature, but I'd like to be able to get all rows without dealing with that, and there's no simple way to do it.

Describe the solution you'd like

I'd like to be able to specify that I don't want to set a limit on the number of rows to fetch. (Actually, ideally unlimited/unpaginated should've probably been the default, IMO, but I understand that doing that now would be a breaking change since the current default is 100, so I'll settle for it being on by default and being able to turn it off.)

Describe alternatives you've considered

The first thing I tried was setting limit to 0, since I'd noticed that the code checks for limit < 0 as an error but not limit == 0. Instead, what I got was no rows whatsoever, which I suppose is technically what one should maybe expect from saying you want at most zero rows, but also feels useless enough to not be intentional.

I've been able to work around the situation by using count to get the total number of rows and setting limit to the result, but that feels like a gruesome hack that I'd rather avoid.

Additional context

I'm willing to work on a PR for this, and am filing this issue at least in part to see a. whether this would even be welcomed, and b. whether you'd prefer:

Turning off pagination in the existing methods by setting limit=0
Turning off pagination in the existing methods by setting limit=None
Splitting off and having a separate set of methods for unpaginated multi-row gets (regular and joined).

(I suppose an additional benefit of option 3 would be that the methods would just return the rows instead of wrapping them in a dict that the ["data"] needs to get dug out of.)

igorbenav commented 2 months ago

I think this might be a good addition, I thought about doing it in the existing get_multi and get_multi_joined (something like passing -1 retrieves all data), but this is not a good option since when you create endpoints automatically you don't want the user to be able to just retrieve all of your data. I guess for regular (specially public) APIs this wouldn't happen a lot, but maybe for some other specific use case.

I don't know, would like more opinions on this.

slaarti commented 2 months ago

Yeah, I get that. Meanwhile, I'll note that my own uses are entirely in private projects I run only on localhost specifically to get aggregate views across sets of scraped data, in many cases because of pagination. 😄 Anyway, I ended up agreeing with the idea that the existing methods should probably remain as-is.

To be honest, the resulting get_multi_all, get_multi_joined_all, and tests for both of same are basically copy-pastes of the existing methods with offset, limit, and return_total_count removed, the return value simplified into just a list of result rows/objects (depending on return_as_model as with the existing methods), and typing/docstrings adjusted accordingly. That's a lot less DRY than I necessarily like, but I figured that was an optimization that could be done once the functionality was in place where more people than me could see it.

PR incoming momentarily.

igorbenav commented 2 months ago

I guess it'd actually be ok if it was in the same function, but we change the endpointcreator to raise an error when requesting the unlimited version (like -1 or None).

Less repetition, still safe for endpoints. A warning should be added in the docstring (if you are using this function in an endpoint, sanitize the query parameters before passing it to get_multi to make sure the user can't get all of the data)

slaarti commented 2 months ago

Ah, yeah, I'd meant to mention in my previous comment that with separate methods, it'd be easy enough to just not include them in automated endpoint creation, so making them available is an active choice by the app developer who presumably knows what they're making available.

But as you say, changing the existing methods and adding guards on the endpoint creator would work too (I'd personally then want options to squelch the error, but even without that, I could just write my own "unlimited get" endpoint code). I'm not particularly precious as to the specific way it happens, so long as the functionality is there. 😃

igorbenav commented 2 months ago

Closed by #62

igorbenav / fastcrud

get_multi & co. assume/force pagination #59