Closed: RhysJackson closed this issue 6 years ago.
Not as it's implemented now, but I suppose it makes sense to override it. I think it would be more convenient if all you needed to do was use R's Date type to get a BigQuery DATE, and leave R's POSIXct as TIMESTAMP.
That would mean changing this https://github.com/cloudyr/bigQueryR/blob/af38d0e8de01e7c927e06a904f5a5dfd5a82ddb2/R/uploadData.R#L313-L326 to the below:
## From bigrquery
data_type <- function(x) {
  switch(class(x)[1],
    character = "STRING",
    logical   = "BOOLEAN",
    numeric   = "FLOAT",
    integer   = "INTEGER",
    factor    = "STRING",
    Date      = "DATE", # <- changed this to DATE
    POSIXct   = "TIMESTAMP",
    hms       = "INTEGER",
    difftime  = "INTEGER",
    stop("Unknown class ", paste0(class(x), collapse = "/"))
  )
}
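To show what that change buys you, here is a quick sanity check of the mapping (the switch is repeated so the snippet runs standalone):

```r
## Same switch as above, repeated so this snippet runs on its own
data_type <- function(x) {
  switch(class(x)[1],
    character = "STRING",
    logical   = "BOOLEAN",
    numeric   = "FLOAT",
    integer   = "INTEGER",
    factor    = "STRING",
    Date      = "DATE",
    POSIXct   = "TIMESTAMP",
    hms       = "INTEGER",
    difftime  = "INTEGER",
    stop("Unknown class ", paste0(class(x), collapse = "/"))
  )
}

data_type(Sys.Date())  # "DATE"      - Date columns now map to DATE
data_type(Sys.time())  # "TIMESTAMP" - POSIXct is unchanged
data_type(1.5)         # "FLOAT"
```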
Gave it a go; try it with the GitHub version. This may break existing uploads for some people, though.
...so to support that, you can now also supply your own schema. schema_fields(upload_data) is exported so you can modify that result if required.
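As a sketch of what overriding an auto-detected schema could look like. Note the list-of-fields shape (name/type pairs) and the make_schema helper are assumptions for illustration, not the package's exact API; check the actual output of schema_fields() in your version:

```r
## Hypothetical sketch: build a BigQuery-style schema for a data.frame,
## then force one column's type before uploading.
## make_schema and the name/type field shape are illustrative assumptions.
make_schema <- function(df, types) {
  mapply(function(nm, ty) list(name = nm, type = ty),
         names(df), types, SIMPLIFY = FALSE, USE.NAMES = FALSE)
}

upload_data <- data.frame(Date = Sys.Date(), value = 1.5)
schema <- make_schema(upload_data, c("TIMESTAMP", "FLOAT"))

## Override the auto-detected TIMESTAMP to DATE for the Date column
schema[[1]]$type <- "DATE"
schema[[1]]$type  # "DATE"
```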
Wow, that was quick! Thanks Mark.
With the first change I'm getting a different but related error: Error while reading data, error message: Could not parse '1515628800' as date for field Date
Are the date fields getting coerced into POSIXct somewhere along the line?
I'll try the modified schema_fields() approach in just a moment.
Ah ok, I just saw this code:
standard_csv <- function(values) {
  ...
  is_date <- vapply(values, function(x) inherits(x, "Date"), logical(1))
  values[is_date] <- lapply(values[is_date], function(x) as.numeric(as.POSIXct(x)))
  ...
}
So it looks like dates were being converted to numerics on that line. I took that out; try it now.
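The error message above can be reproduced directly: as.POSIXct() on a Date gives seconds since the epoch (Dates are treated as midnight UTC), which BigQuery cannot parse as a DATE literal:

```r
## Why BigQuery saw '1515628800': the coercion above turned a Date
## column into POSIXct and then into epoch seconds
d <- as.Date("2018-01-11")
as.numeric(as.POSIXct(d))  # 1515628800 - not a valid DATE literal

## Without that coercion, the Date is written in ISO 8601 format,
## which BigQuery's DATE type parses correctly
format(d)                  # "2018-01-11"
```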
Amazing, that's done the trick.
Thank you so much. Strangely, assert_that() couldn't be found despite being a dependency, but explicitly loading it fixed the problem.
https://github.com/cloudyr/bigQueryR/blob/af38d0e8de01e7c927e06a904f5a5dfd5a82ddb2/R/uploadData.R#L155
Example:
Explanation:
I'm unable to append data from a non-gs:// source to an existing BigQuery table, due to a mismatch between the existing schema and the schema produced by schema_fields(). Specifically, Date values are being interpreted as Timestamps in the schema passed to BigQuery.
Is there any way of passing a custom schema to bqr_upload_data() for non-cloud-storage data sources?