artemklevtsov / RGA

A Google Analytics API client for R
http://cran.r-project.org/package=RGA

Some metrics are incorrect using fetch.by #48

Closed acvelozo closed 6 years ago

acvelozo commented 7 years ago

Hi Artem,

First of all, thank you for this great package. It's a really huge time saver for my work.

I have encountered an issue with some metrics obtained using get_ga() and fetch.by. My code is the following:

AnalyticsCoreData <- get_ga(profileId  = XXXXXXXXX,
                            start.date = "2016-11-01",
                            end.date   = "2016-11-30",
                            dimensions = "ga:adPlacementDomain",
                            metrics    = c("ga:sessions", "ga:percentNewSessions",
                                           "ga:bounceRate", "ga:avgSessionDuration"),
                            fetch.by   = "week")

I am using the latest version available from GitHub. Using the fetch.by parameter allows me to avoid sampling: with "week" or "day", no sampling occurs. However, many of the values obtained this way for percentNewSessions and bounceRate are greater than 100%, which is not possible.

In addition, although the result for sessions is the same in both cases (using "week" or "day"), I obtain different results for the other three metrics.

It seems that fetch.by splits the query into several smaller queries and then adds up the data to get to the overall date range. However, for metrics such as percentNewSessions, bounceRate and avgSessionDuration, a sum would provide an inaccurate result.
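To illustrate the point, here is a minimal sketch with made-up numbers (not real Google Analytics data, and not the package's internal code). It assumes four weekly chunks and compares the sum of the per-week bounce rates with the rate recomputed from the additive counts. It would also explain why "week" and "day" give different values: the more chunks there are, the more rates get added together.

```r
# Illustrative numbers only; these are not real Google Analytics results.
weekly <- data.frame(
  week     = 1:4,
  sessions = c(100, 200, 150, 50),
  bounces  = c(40, 100, 90, 10)
)

# Per-week bounce rate in percent, as each smaller query would report it.
weekly$bounceRate <- 100 * weekly$bounces / weekly$sessions

# Adding up the per-week rates gives a value that can exceed 100%.
summed_rate <- sum(weekly$bounceRate)

# Recomputing the rate from summed counts gives the rate for the whole period.
true_rate <- 100 * sum(weekly$bounces) / sum(weekly$sessions)

print(summed_rate)  # 170
print(true_rate)    # 48
```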

Thank you and best regards

artemklevtsov commented 7 years ago

> It seems that fetch.by splits the query into several smaller queries and then adds up the data to get to the overall date range.

You're right. When the results are collected, we simply sum the metrics.

acvelozo commented 7 years ago

Thank you for the useful clarification.

I have found a way to avoid this problem in my case.

Basically, instead of requesting bounceRate, I can request bounces and then calculate bounceRate = bounces / sessions myself. Summing is fine for both bounces and sessions.

Using the same logic I can also get the other metrics percentNewSessions and avgSessionDuration.
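A rough sketch of this workaround follows. It assumes the query requests the additive metrics ga:sessions, ga:newUsers, ga:bounces and ga:sessionDuration in place of the three ratio metrics, and that the returned columns are named as shown; the actual column names from get_ga() may differ, so adjust them as needed. The helper derive_ratios() is hypothetical and not part of RGA, and the data frame is mock data standing in for the summed result.

```r
# Hypothetical helper (not part of RGA): derive ratio metrics from additive totals.
# Column names are assumptions; rename them to match what get_ga() returns.
derive_ratios <- function(df) {
  df$percentNewSessions <- 100 * df$newUsers / df$sessions   # percent
  df$bounceRate         <- 100 * df$bounces / df$sessions    # percent
  df$avgSessionDuration <- df$sessionDuration / df$sessions  # seconds per session
  df
}

# Mock data standing in for summed get_ga() output (illustrative values only).
totals <- data.frame(
  adPlacementDomain = c("example.com", "example.org"),
  sessions          = c(500, 200),
  newUsers          = c(300, 50),
  bounces           = c(240, 100),
  sessionDuration   = c(60000, 9000)
)

result <- derive_ratios(totals)
print(result[, c("adPlacementDomain", "percentNewSessions",
                 "bounceRate", "avgSessionDuration")])
```

Rows with zero sessions would produce NaN in the derived columns, so they may need to be filtered out or handled separately.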

Should I close the issue?