googledatalab / datalab

Interactive tools and developer experiences for Big Data on Google Cloud Platform.
Apache License 2.0
974 stars 249 forks source link

BigQuery fields of type RECORD not supported #2080

Open TAPeri opened 6 years ago

TAPeri commented 6 years ago

When I run a query that contains a a column of type STRING and a column of type RECORD column, only the STRING column shows up in the results.

%bq query
SELECT fullvisitorid, totals
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20160830` 
LIMIT 10

When I export the result to CSV, the whole record is saved as a string (with json encoding) below the name of the record, but then additional empty columns are added to the csv for each of the record keys.

fullvisitorid,totals,totals.pageviews,totals.sessions
895954260133011192,"{u'pageviews': 17, u'sessions': 1}",,
413376117696916627,"{u'pageviews': 23, u'sessions': 1}",,
paveldournov commented 6 years ago

trying a workaround SELECT fullvisitorid, totals.pageviews, totals.sessions FROM bigquery-public-data.google_analytics_sample.ga_sessions_20160830 LIMIT 10

TAPeri commented 6 years ago

Thanks for the workaround. I think that is good once we have reduced the number of features, but it is too cumbersome for initial exploratory analysis, as the original google analytics export has several more record fields, each with 10+ subfields