anelendata / tap-bigquery

Singer.io tap for extracting data from BigQuery tables
Apache License 2.0
16 stars 32 forks source link

`Cannot access field key on a value with type ARRAY` when trying to select (record, repeated) fields #25

Open astrojuanlu opened 2 years ago

astrojuanlu commented 2 years ago

Trying to extract a table with (record, repeated) fields, coming from Google Analytics 4:

image

using this config:

{
  "streams": [
    {
      "name": "analytics_257428027",
      "table": "orchest-ga4-data.analytics_257428027.events_20220810",
      "columns": [
        "event_timestamp",
        "event_name",
        "event_params.key",
        "event_params.value.string_value",
        "event_params.value.int_value",
        "traffic_source.name",
        "traffic_source.medium",
        "traffic_source.source"
      ],
      "datetime_key": "event_date"
    }
  ],
  "credentials_path": "/project-dir/meltano/client_secrets.json",
  "start_datetime": "2022-08-10T00:00:00Z",
  "start_always_inclusive": true
}

I'm getting this error:

 INFO Running query:
    SELECT event_timestamp,event_name,event_params.key,event_params.value.string_value,event_params.value.int_value,traffic_source.name,traffic_source.medium,traffic_source.source,event_date FROM orchest-ga4-data.analytics_257428027.events_20220810 WHERE 1=1 AND datetime '2022-08-10 00:00:00.000000' <= CAST(event_date as datetime) AND CAST(event_date as datetime) < datetime '2022-08-11 10:00:20.205737' ORDER BY event_date LIMIT 100
CRITICAL 400 Cannot access field key on a value with type ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, ...>>> at [1:48]
astrojuanlu commented 2 years ago

I believe setting the columns to * or even ['event_params'] (the name of the offending field, without any sub fields) does the trick. However, I cannot verify it at the moment because I'm being struck by gh-17, so I'm creating a view to see if it works.