skardhamar / rga

R Google Analytics

Reproducible Error when No results with Dimension split #62

Open ebalp opened 9 years ago

ebalp commented 9 years ago

getData() fails when a segment matches zero sessions and the query is split by a dimension. It would be great to receive an empty data frame instead of an error.

This gives no error because there are sessions from Boston:

seg <- 'sessions::condition::ga:city==Boston'
data <- ga$getData(ids, batch=TRUE, walk=TRUE,"2014-12-08","2014-12-09", 
                   metrics = "ga:sessions", dimensions = "ga:source", 
                   sort = "", filters = "", segment = seg)

But since there are no sessions from Timbuktu, the following:

seg <- 'sessions::condition::ga:city==Timbuktu'
data <- ga$getData(ids, batch=TRUE, walk=TRUE,"2014-12-08","2014-12-09", 
                   metrics = "ga:sessions", dimensions = "ga:source", 
                   sort = "", filters = "", segment = seg)

Gives this error:

Error in if (nrow(ga.data$rows) < ga.data$totalResults && (messages ||  : 
  missing value where TRUE/FALSE needed

Furthermore, when batch=TRUE is removed:

seg <- 'sessions::condition::ga:city==Timbuktu'
data <- ga$getData(ids, walk=TRUE,"2014-12-08","2014-12-09", 
                   metrics = "ga:sessions", dimensions = "ga:source", 
                   sort = "", filters = "", segment = seg) 

The error becomes:

Error in names(row) <- ga.headers$name : 
  'names' attribute [2] must be the same length as the vector [1]

Thank you very much for your help.

jdeboer commented 9 years ago

This situation is handled in the ganalytics package; see the source code starting at https://github.com/jdeboer/ganalytics/blob/master/R/GaListToDataframe.R#L28. A similar approach could be used here. Essentially, you can return an empty data.frame with named columns by selecting none of the rows of a data.frame that already has those columns, i.e. my_data_frame[0,]
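As a minimal sketch of the zero-row idea, the snippet below uses a made-up data.frame (the `populated` variable and its column names are purely illustrative and are not taken from rga or ganalytics):

```r
# Hypothetical data.frame shaped like a query result.
populated <- data.frame(
  source   = c("google", "(direct)"),
  sessions = c(10, 4),
  stringsAsFactors = FALSE
)

# Selecting zero rows keeps the column names and types but drops all data.
empty <- populated[0, ]

nrow(empty)    # 0
names(empty)   # "source" "sessions"
```

Downstream code that calls rbind() or checks nrow() can then treat "no results" like any other result.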

s6mike commented 9 years ago

I'm getting this error too. It would be great to have a fix to get an empty data frame as proposed.

Kusara commented 9 years ago

I think this can be fixed by replacing nrow(ga.data$rows) with NROW(ga.data$rows) on line 135.

if (nrow(ga.data$rows) < ga.data$totalResults && (messages || isBatch)) {
    if (!isBatch) {
        message(paste("Only pulling", length(ga.data$rows), "observations of", ga.data$totalResults, "total (set batch = TRUE to get all observations)"))
    } else {
        if (adjustMax) {
            max <- ga.data$totalResults
        }
        message(paste("Pulling", max, "observations in batches of", batch))
        # pass variables to batch-function
        return(.self$getDataInBatches(total = ga.data$totalResults, max = max, batchSize = batch,
                                      ids = ids, start.date = start.date, end.date = end.date, date.format = date.format,
                                      metrics = metrics, dimensions = dimensions, sort = sort, filters = filters,
                                      segment = segment, fields = fields, envir = envir))
    }
}

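When RGA receives no rows from its call, ga.data$rows does not exist, so it evaluates to NULL, and nrow() returns NULL for it. The comparison with ga.data$totalResults then has no TRUE/FALSE value for if() to test. NROW() treats NULL as a zero-length vector and returns 0, which I believe is the intended behaviour.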

The fix in https://github.com/skardhamar/rga/issues/52 is also necessary for this to work.
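The nrow()/NROW() difference described above can be checked in isolation. This is only a sketch: `rows` and `total_results` are stand-ins for ga.data$rows and ga.data$totalResults, not values taken from rga.

```r
rows <- NULL          # stand-in for a missing ga.data$rows
total_results <- 0    # stand-in for ga.data$totalResults

is.null(nrow(rows))   # TRUE: NULL has no dim attribute, so nrow() gives NULL
NROW(rows)            # 0: NROW() falls back to length() for dimensionless objects

# Comparing NULL with a number gives a zero-length logical,
# which if() cannot use as a condition.
length(nrow(rows) < total_results)   # 0

# With NROW() the comparison is a single logical value.
NROW(rows) < total_results           # FALSE
```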

adaish commented 9 years ago

Hi, has this been fixed? I have the same error: the GA search data comes through up to a specific date, and then I get "Error in if (nrow(ga.data$rows) < ga.data$totalResults && (messages || : missing value where TRUE/FALSE needed". Is it because the data is null, even though I can see the search results in the GA dashboard? Please let me know ASAP, as otherwise I will have to rebuild using one of the alternative GA packages for R. Many thanks, Alice

Kusara commented 9 years ago

Hi Alice,

I ended up just modifying my local copy of rga. Here's a pastebin with the code: http://pastebin.com/q5n3ZUQH

Download the source of rga, overwrite core.R with what is in the pastebin, uninstall your current copy of rga, and then run install.packages(pathToModifiedRGA, repos = NULL, type = "source"). After that you should be set.

Oh, you'll need to delete and re-instantiate your "ga" variable as well.

Cheers,

Alex

mattpolicastro commented 9 years ago

Kusara, are you comfortable submitting a pull request? Seems like a pretty straightforward fix. If not, let me know so I can test and do so on your behalf.

Kusara commented 9 years ago

Hi Mattpolicastro,

Never have before, but it's something I think would be cool to do. I'll give it a go and ping you if things go south.

Cheers,

Alex

adaish commented 9 years ago

Hi, thanks for the fix, but I now get a different error, which still has to do with the lack of results even though I can view the data in GA: "Error in ga$getData(ID, start.date = dat[i], end.date = dat[i], metrics = "ga:pageviews,ga:uniquePageviews,ga:avgTimeOnPage,ga:entrances,ga:bounceRate,ga:exitRate", : no results: 0". Any ideas for how to fix this?

Kusara commented 9 years ago

Hi again adaish,

A hack to get it to complete the query is to set rbr=TRUE in your call to ga$getData(). This will give you rows with all NA values where the request to GA came back with no results. If you're doing the query with batch=TRUE and walk=TRUE you can usually interpolate the missing dates and fill in all the metrics with zero.
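Filling in the gaps afterwards might look roughly like the sketch below. It assumes the walked result has a `date` column and that empty days show up as all-NA rows; the `result` data.frame and its values are invented for illustration, so adjust the column names to whatever your query returns.

```r
# Hypothetical walked result: the middle day came back empty, so every
# column (including the date) is NA for that row.
result <- data.frame(
  date     = as.Date(c("2014-12-08", NA, "2014-12-10")),
  sessions = c(12, NA, 7)
)

# Drop rows in which every column is NA.
all_na   <- rowSums(!is.na(result)) == 0
observed <- result[!all_na, , drop = FALSE]

# Rebuild the full date range and left-join the observed rows onto it.
all_days <- data.frame(
  date = seq(as.Date("2014-12-08"), as.Date("2014-12-10"), by = "day")
)
filled <- merge(all_days, observed, by = "date", all.x = TRUE)

# Days with no data get zero for the metric.
filled$sessions[is.na(filled$sessions)] <- 0
```

Zero is a sensible fill for count metrics such as sessions; ratio metrics such as bounce rate may be better left as NA.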

The rbr=TRUE trick is taking advantage of this part of the code, starting at line 158:

            # did not return any results
            if (!inherits(ga.data$rows, "matrix") && !rbr) {
                stop(paste("no results:", ga.data$totalResults))
            } else if (!inherits(ga.data$rows, "matrix") && rbr) {
                # return data.frame with NA, if row-by-row setting is true
                row <- as.data.frame(matrix(NA, ncol = length(ga.headers$name), nrow = 1))
                colnames(row) <- ga.headers$name
                return(row)
            }
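To see what that branch produces, and how it differs from the zero-row data frame requested at the top of this issue, here is a standalone sketch. The `headers` vector is a hypothetical stand-in for ga.headers$name, and this is not a patch to rga itself.

```r
# Hypothetical stand-in for ga.headers$name.
headers <- c("ga:source", "ga:sessions")

# What the rbr = TRUE branch builds: one row, every column NA.
na_row <- as.data.frame(matrix(NA, ncol = length(headers), nrow = 1))
colnames(na_row) <- headers

# The zero-row alternative: same columns, no rows.
empty <- na_row[0, , drop = FALSE]

dim(na_row)   # 1 2
dim(empty)    # 0 2
```

The NA row keeps a placeholder for each empty request, which helps when counting days, while the zero-row version disappears cleanly when results are combined with rbind().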

Cheers,

Alex