Closed by daikman 7 months ago
The new batched wrapper function around get_resource() looks great! A couple of initial thoughts:
Agree with Csilla's comments.
I would suggest calling the offset parameter skip, skip_n, skip_rows or something like that, as I think that is more intuitive. The name offset only really makes sense in the context of the batch function.
I wonder if you could run some tests for speed and number of timeouts to determine a good default batch size. That default could then also be baked into the get_resource function, so that, for example, when more than 10,000 rows are requested it falls back to the batch function with n_rows = 2,000.
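To make the suggested fallback concrete, here is a minimal sketch in Python. It is not the package's code: `fetch_rows`, `fetch_page` and the `limit`/`offset` parameter names are hypothetical, and the 10,000 and 2,000 values are only the untested figures floated in this comment.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Figures from the comment above; they are suggestions, not measured values.
BATCH_THRESHOLD = 10_000
DEFAULT_BATCH_SIZE = 2_000


def fetch_rows(
    fetch_page: Callable[[int, int], List[Row]],
    n_rows: int,
    batch_threshold: int = BATCH_THRESHOLD,
    batch_size: int = DEFAULT_BATCH_SIZE,
) -> List[Row]:
    """Fetch n_rows rows, switching to batches for large requests.

    fetch_page(limit, offset) is a hypothetical stand-in for a single
    request that returns at most `limit` rows starting at `offset`.
    """
    if n_rows <= batch_threshold:
        # Small request: a single call is enough.
        return fetch_page(n_rows, 0)

    rows: List[Row] = []
    offset = 0
    while offset < n_rows:
        # The last batch may be smaller than batch_size.
        limit = min(batch_size, n_rows - offset)
        page = fetch_page(limit, offset)
        rows.extend(page)
        if len(page) < limit:
            # The resource has fewer rows than requested; stop early.
            break
        offset += limit
    return rows
```

Passing the page fetcher in as an argument keeps the sketch testable without network access, and the same loop could be timed with different `batch_size` values to pick a default.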
Closing this, as using the dump endpoint for requests of more than 99999 rows is efficient enough (can revisit in future if needed).
I've added an "offset" argument to get_resource() to enable batch downloading. I also created get_resource_batched() to wrap around get_resource() to easily download a resource in batches. I haven't written tests for the new function yet, but get_resource() still passes its tests despite the changes made to it.
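For readers who have not seen the diff, the general pattern of a batched wrapper built on an offset argument can be sketched as follows. This is an illustration in Python under stated assumptions, not the code in this PR: `download_in_batches` is a hypothetical name, and the `limit` and `offset` keyword names on the single-request function are assumed rather than taken from the real get_resource() signature.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def download_in_batches(
    get_resource: Callable[..., List[Row]],
    batch_size: int,
) -> List[Row]:
    """Download every row of a resource by repeated offset requests.

    get_resource(limit=..., offset=...) is a hypothetical single-request
    function that returns at most `limit` rows starting at `offset`.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")

    rows: List[Row] = []
    offset = 0
    while True:
        page = get_resource(limit=batch_size, offset=offset)
        rows.extend(page)
        if len(page) < batch_size:
            # A short (or empty) page means the resource is exhausted.
            return rows
        offset += batch_size
```

When the row count is an exact multiple of `batch_size`, the loop makes one extra request that returns an empty page before stopping; that is the cost of not knowing the total in advance.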