SDITools / adobeanalyticsr

R Client for Adobe Analytics API v2.0

"daterangeday" as first/only dimension only returning the top 5 #141

Open gilliganondata opened 2 years ago

gilliganondata commented 2 years ago

It seems like the expected "return all" when the first dimension is a date is not working under some conditions.

When I run the following:

df <- aw_freeform_table(company_id, rsid,
                        date_range = c("2020-05-01", "2022-04-30"),
                        dimensions = "daterangeday",
                        metrics = "visits")

I get the following message/results:

Requesting data...
Done!
Returning 5 x 2 data frame

It appears to be using the default of 5 for top, even though the first dimension is a daterange... one.

It was easy enough to work around by adding top=1000, but the behavior runs counter to the documentation.

This is R v4.1.3 with adobeanalyticsr v0.3.2.

benrwoodard commented 2 years ago

Setting the top argument to 0 (top = 0) will get you the number of days expected. It would definitely make sense to automatically interpret 'daterangeday' as the number of days in the date range. I think I just stopped short of doing that because someone might need to define a specific number of days when it is used in a series of dimensions. Maybe due to using a metric defined with an attribution model?
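To make the "number of days in the date range" idea concrete, here is a minimal base R sketch. The helper name days_in_range is hypothetical and not part of adobeanalyticsr; it only counts the calendar days between the two endpoints, inclusive. The commented-out call assumes an authenticated session and real company_id and rsid values.

```r
# Hypothetical helper (not part of adobeanalyticsr): count the calendar days
# covered by a two-element date range, inclusive of both endpoints.
days_in_range <- function(date_range) {
  dates <- as.Date(date_range)
  stopifnot(length(dates) == 2, !anyNA(dates), dates[2] >= dates[1])
  as.integer(dates[2] - dates[1]) + 1L
}

date_range <- c("2020-05-01", "2022-04-30")
n_days <- days_in_range(date_range)

# Assumed usage, requires an authenticated session and real IDs:
# df <- aw_freeform_table(company_id, rsid,
#                         date_range = date_range,
#                         dimensions = "daterangeday",
#                         metrics = "visits",
#                         top = n_days)  # or simply top = 0
```

Passing an explicit day count does the same job as the top = 1000 workaround from the original report, without having to guess at a number that is large enough.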

charlie-gallagher commented 2 years ago

Ach, yeah, the current behavior doesn't match the documentation. Must've slipped through the unit tests. @benrwoodard this happens in make_explicit_top, which also has unit tests we can update.

On a user-interface note, I agree with Ben and probably would take this a bit further. I don't see why we should make it impossible to let users return the top 5 minutes or hours. I think this would be easier to use and implement if we just had two rules:

Then there are the normal recycling rules, which most R users are familiar with.

gilliganondata commented 2 years ago

Okay. Yeah. I thought I'd tested this way back with the initial release and it behaved as it's documented. But it seems totally fine to make it a documentation bug: adding top = 0 is easy enough to use in code.

Interestingly, of course, if you don't specify top (so it's top = 5) or if you do specify a value that is less than the number of days in the date range, it's going to return the "top X days based on the first metric." That's actually a potentially (?) useful application: "I'd like to know what the top 10 days for revenue over the past year were"... a top = 10 with daterangeday is going to return that!
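As a local illustration of what "top X days based on the first metric" means, here is a small base R sketch. The data frame is made-up sample data and top_days is a hypothetical helper, not the package's implementation; it just ranks days by a metric and keeps the first n.

```r
# Made-up sample data standing in for a daily report.
daily <- data.frame(
  daterangeday = seq(as.Date("2022-01-01"), by = "day", length.out = 7),
  revenue = c(120, 340, 90, 510, 275, 60, 430)
)

# Hypothetical helper: keep the n rows with the highest value of `metric`.
top_days <- function(df, metric, n) {
  ranked <- df[order(df[[metric]], decreasing = TRUE), , drop = FALSE]
  head(ranked, n)
}

top3 <- top_days(daily, "revenue", 3)

# Assumed equivalent request against the API (needs authentication and real
# IDs; the "revenue" metric ID is an assumption):
# aw_freeform_table(company_id, rsid,
#                   date_range = c("2021-05-01", "2022-04-30"),
#                   dimensions = "daterangeday",
#                   metrics = "revenue",
#                   top = 10)
```

Note that the result comes back ordered by the metric rather than by date, so it would need re-sorting for a time series view.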