skardhamar / rga

R Google Analytics

Can't pull data for more than 10k a day even with batch #46

Open maxim-uvarov opened 10 years ago

maxim-uvarov commented 10 years ago

I reinstalled R from scratch

R version 3.1.0 (2014-04-10) -- "Spring Dance"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

I have a script that worked before:

ga.data = ga$getData("ga:1111111", batch = TRUE, walk = TRUE,
                     "2012-28-01", "2012-12-31",
                     metrics = "ga:sessions,ga:transactions,ga:goal4completions,ga:pageviews,ga:bounces,ga:sessionDuration",
                     dimensions = "ga:keyword,ga:sourceMedium,ga:date,ga:region,ga:landingPagePath,ga:adContent,ga:adMatchedQuery",
                     sort = "",
                     filters = "ga:browser=~(Chrome|Firefox|Internet Explorer|Opera|Safari|YaBrowser)",
                     segment = "")

Now it pulls only the first 10k rows of data and doesn't use batching :(

no batch

kingo55 commented 10 years ago

I'm also having trouble with this, though I didn't in the past. I'm going to roll back to the fork @willempaling has developed, since something in the new commits must have broken this.

artemklevtsov commented 10 years ago

Hi,

Can't confirm this:

> ga_df <- ga$getData(ids = id, start.date = "2014-03-09", end.date = "today", metrics = "ga:sessions,ga:transactions,ga:goal4completions,ga:pageviews,ga:bounces,ga:sessionDuration",
+ dimensions = "ga:keyword,ga:sourceMedium,ga:date,ga:region,ga:landingPagePath,ga:adContent,ga:adMatchedQuery", filters = "ga:browser=~(Chrome|Firefox|Internet Explorer|Opera|Safari|YaBrowser)", batch = TRUE, walk = TRUE)
> nrow(ga_df)
[1] 366999

I used 79dd787 commit.

maxim-uvarov commented 10 years ago

What information should I provide?

lunametrics commented 10 years ago

Hi unikum, I also used commit 79dd787 and your code example, and I get nrow(ga_df) = 10000.

R version: [64-bit] 3.1.0

Switching to @willempaling's fork didn't change the result.

maxim-uvarov commented 10 years ago

This is a success!!! Lunametrics is in my thread on GitHub! Dreams come true! :)

lunametrics commented 10 years ago

Glad to help your dreams come true, 40-02 ;)

Update: batch is now working for me. I think the only change is that I requested a new OAuth token. Very strange.

I'll comment again if I notice any rhyme or reason for the batch feature sometimes not working. LOVE the package, by the way. Planning to do a blog post with some example report templates.

maxim-uvarov commented 10 years ago

As for me, the problem is still here.

MarkEdmondson1234 commented 10 years ago

I'm getting this too, but only for some calls, all made with the same authentication.

This fetches only 10000 rows (out of 13000 results):

ga_data_x <- ga$getData(ids="xxxxxx",
                        start.date = "2013-01-01",
                        end.date   = yesterday,
                        metrics    = "ga:sessions",
                        dimensions = "ga:date,ga:country",
                        filters    = "", batch=T)

But this call in the same script (i.e. using the same authentication) works fine and fetches 92925 rows:

ga_data_allVisits <- ga$getData(ids=UA,
                        start.date = "2013-01-01",
                        end.date   = yesterday,
                        metrics    = "ga:sessions,ga:goal2Completions,ga:goal11Completions,ga:goal12Completions,ga:goal12ConversionRate",
                        dimensions = "ga:date,ga:country",
                        filters    = "",
                        batch=TRUE)

Kusara commented 9 years ago

I think this bug has to do with

            if (length(ga.data$rows) < ga.data$totalResults && (messages || isBatch)) {

on line 135 of core.R.

ga.data$rows shows up as a matrix in the pull that generates this batch-size error for me. length() on a matrix counts every cell (rows times columns), not rows, so whenever the cell count of the first response is not smaller than ga.data$totalResults the condition is false and the $getDataInBatches call on line 144 is skipped.

A workaround is to pull a small number of metrics at a time, like one or two. I think a fix would be to use nrow() instead of length() if ga.data$rows is always a matrix, or perhaps NROW() if it is sometimes a vector or list.
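To illustrate the difference between the two counts, here is a minimal sketch in plain R. It does not call rga; `rows` and `total_results` are hypothetical stand-ins for ga.data$rows and ga.data$totalResults, and the 10000 x 3 shape is only an assumption (two dimensions plus one metric, first page of 10000 rows).

```r
# Hypothetical stand-in for the first API response: 10000 rows, 3 columns,
# with 13000 results available in total (assumed numbers, for illustration).
rows <- matrix("x", nrow = 10000, ncol = 3)
total_results <- 13000

cells  <- length(rows)  # counts every cell: 10000 * 3
n_rows <- NROW(rows)    # counts rows; unlike nrow(), also works on vectors

# Comparison in the style of the quoted condition: the cell count is
# already above total_results, so batching would be skipped.
needs_batch_length <- cells < total_results

# Row-based comparison: fewer rows than total_results, so the remaining
# rows would still be fetched.
needs_batch_nrow <- n_rows < total_results

print(c(cells = cells, rows = n_rows))
print(c(length_check = needs_batch_length, row_check = needs_batch_nrow))
```

With these assumed numbers the cell-based check is false and the row-based check is true, which would match the behaviour described above. NROW() is used rather than nrow() because nrow() returns NULL for a plain vector.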

It seems the fix also requires changing

                } else {
                    adjustMax <- FALSE
                }

on line 41 to

                } else {
                    adjustMax <- TRUE
                }

Going to test it some more.