SDITools / adobeanalyticsr

R Client for Adobe Analytics API v2.0

Add a way to pull all available rows #122

Open benrwoodard opened 2 years ago

benrwoodard commented 2 years ago

The API limit is 50k rows. The goal of this enhancement is to accept "all" as a value for the 'top' argument so that every result is returned. One possible solution is an option similar to what we did for date ranges, where "0" pulls all dates or hours. If a user passes "all" as the last 'top' value, the last API call would work out how many pages are needed to pull all the rows and then loop through the pages, compiling the final dataset. It may also work to simply have an "all" argument set to TRUE or FALSE, in which case the last API call would loop through the pages. In theory this is only viable for the last call, since there would be no way to pull 50k+ rows and then make additional API calls on those rows.

charlie-gallagher commented 2 years ago

I was just playing with the API, and I think this would be almost straightforward. We would set a condition that, if top is "all", we continue querying until the response contains "lastPage=true". We wouldn't be able to predict how long it would take, of course, but we could definitely do it. I'll see if I can add that logic to the query function. Hopefully everything's modularized enough that it only means changing one function a little bit.
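
A minimal sketch of that "keep querying until lastPage is true" loop, written in Python for brevity rather than in the package's R. `fetch_page`, `make_fake_api`, and the `rows` field name are hypothetical stand-ins for illustration, not the package's functions; only `lastPage` comes from the discussion above.

```python
from typing import Any, Callable, Dict, List


def fetch_all_rows(fetch_page: Callable[[int], Dict[str, Any]]) -> List[Any]:
    """Request pages 0, 1, 2, ... until a response reports lastPage = true."""
    rows: List[Any] = []
    page = 0
    while True:
        response = fetch_page(page)
        rows.extend(response["rows"])  # "rows" is an assumed field name
        # Treat a missing flag as the last page so the loop cannot run forever.
        if response.get("lastPage", True):
            break
        page += 1
    return rows


def make_fake_api(data: List[Any], limit: int) -> Callable[[int], Dict[str, Any]]:
    """Hypothetical in-memory stand-in for the paginated API call."""
    def fetch_page(page: int) -> Dict[str, Any]:
        start = page * limit
        return {
            "rows": data[start:start + limit],
            "lastPage": start + limit >= len(data),
        }
    return fetch_page


# Seven rows served three at a time take three requests.
all_rows = fetch_all_rows(make_fake_api(list(range(7)), limit=3))
```

The number of requests is not known up front in this version; the loop simply stops when the API says it is done.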

benrwoodard commented 2 years ago

Couldn't we use the number of pages and the limit from the first response in the last series of API calls to estimate it?

charlie-gallagher commented 2 years ago

I wouldn't rule out giving incremental messages, but we wouldn't be able to say up front how long it would take.

benrwoodard commented 2 years ago

I don't think my message was clear. The response to the API call includes "totalElements". So in theory we could send a request with a limit of 1 to get "totalElements", do the simple math to work out how many pages we would need given the 50k row limit, and then build our final request from there, right?
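
The "simple math" here is a ceiling division of "totalElements" by the page limit. A rough sketch in Python (not the package's R), where `pages_needed` and `PAGE_LIMIT` are hypothetical names and the 50k value is the limit quoted in this thread:

```python
PAGE_LIMIT = 50_000  # row limit per request, as quoted in this thread


def pages_needed(total_elements: int, limit: int = PAGE_LIMIT) -> int:
    """Number of pages required to cover total_elements rows."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if total_elements < 0:
        raise ValueError("total_elements must be non-negative")
    # Integer ceiling division: a partial final page still costs one request.
    return (total_elements + limit - 1) // limit


# 120,001 rows need two full pages plus one page holding a single row.
example_pages = pages_needed(120_001)
```

This assumes the "totalElements" value from the limit-1 request matches the row count of the final request.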

charlie-gallagher commented 2 years ago

Gotcha, I see what you're saying now. I think there will be a big difference between the total number of dimension values (which we'll use to calculate the estimate) and the max number of dimension values for a given combination of dimension levels.

E.g., there might be 500,000 page paths overall, but only 1 page path for a given combination of dimension levels in your breakdown.