fullstorydev / hauser

Service for moving your Fullstory export files to a data warehouse
MIT License

Error uploading from GCS to BigQuery #47

Closed: dbrodsky21 closed this issue 5 years ago

dbrodsky21 commented 5 years ago
2019/01/11 11:26:00 Checking if table fullstory_data_export exists
2019/01/11 11:26:01 Export table exists, making sure the schema in BigQuery is compatible with the schema specified in Hauser
2019/01/11 11:26:01 Checking if table fullstory_sync_table exists
2019/01/11 11:26:03 Checking if table fullstory_data_export exists
2019/01/11 11:26:04 Checking for new export files since 0001-01-01 00:00:00 +0000 UTC
2019/01/11 11:26:05 Processing bundle 154189440048 (start: 2018-11-11 00:00:00 +0000 UTC, end: 2018-11-12 00:00:00 +0000 UTC)
2019/01/11 11:26:05 Getting Export Data for bundle 154189440048
2019/01/11 11:26:11 Loading GCS file: gs://<gcs_bucket>/154189440048.json into table fullstory_data_export$20181111
2019/01/11 11:26:11 Detected first bundle of the day (start: 2018-11-11 00:00:00 +0000 UTC), using WriteTruncate to replace any existing data in partition
2019/01/11 11:26:14 Job failed: {Location: "gs://<gcs_bucket>/154189440048.json"; Message: "Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details."; Reason: "invalid"}
2019/01/11 11:26:14 Error detail: {Location: "gs://<gcs_bucket>/154189440048.json"; Message: "Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details."; Reason: "invalid"}
2019/01/11 11:26:14 Error detail: {Location: "gs://<gcs_bucket>/154189440048.json"; Message: "Error while reading data, error message: Error detected while parsing row starting at position: 0. Error: Data between close double quote (\") and field separator."; Reason: "invalid"}
2019/01/11 11:26:14 Error detail: {Location: ""; Message: "You are loading data without specifying data format, data will be treated as CSV format by default. If this is not what you mean, please specify data format by --source_format."; Reason: "invalid"}
2019/01/11 11:26:14 Failed to load file '/tmp/154189440048.json' to warehouse: {Location: "gs://<gcs_bucket>/154189440048.json"; Message: "Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details."; Reason: "invalid"}

Hauser successfully creates the relevant tables in BigQuery and can upload the Data Export files to GCS when GCSOnly = true, but I get the errors above when GCSOnly = false.

I'm not sure, but Error: Data between close double quote (\") and field separator. seems like the most relevant message here. I'm having trouble debugging it.

I believe I've correctly configured all of the fields in the config file under [gcs] and [bigquery], and I've set Warehouse="bigquery".

Any insight would be much appreciated. Thanks!

dbrodsky21 commented 5 years ago

The issue was resolved by changing local.SaveAsJson to false in the config file.
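
For anyone hitting the same errors: the log above says the load ran without a data format and was treated as CSV, while the file being loaded had a .json extension. The fix is consistent with that mismatch. Below is a minimal Go sketch of why JSON content trips a strict CSV parser. It is not Hauser's code or BigQuery's parser; it uses Go's standard encoding/csv as a stand-in, and sampleRecord, parseAsCSV, and parseAsJSON are hypothetical names made up for illustration.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
)

// sampleRecord is a hypothetical one-line JSON record, not real Fullstory export data.
const sampleRecord = `{"EventType":"click","PageUrl":"https://example.com/"}`

// parseAsCSV runs the input through a strict CSV reader and returns any parse error.
// The first field starts with '{', so it is unquoted, and the double quotes that
// follow are rejected by a strict reader.
func parseAsCSV(data string) error {
	_, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	return err
}

// parseAsJSON decodes the same input as a JSON object and returns any decode error.
func parseAsJSON(data string) error {
	var record map[string]interface{}
	return json.Unmarshal([]byte(data), &record)
}

func main() {
	// The same bytes fail as CSV but decode cleanly as JSON.
	fmt.Println("parsed as CSV: ", parseAsCSV(sampleRecord))
	fmt.Println("parsed as JSON:", parseAsJSON(sampleRecord))
}
```

The exact error text differs from BigQuery's, but the cause is the same kind of problem: quote characters appearing where a CSV parser does not expect them. Either the file format or the declared load format has to change so the two agree, and in this thread the file format was the one changed.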