fullstorydev / hauser

Service for moving your Fullstory export files to a data warehouse
MIT License

Download times out while decoding JSON #82

Open PaulSchnau opened 4 years ago

PaulSchnau commented 4 years ago

I am getting repeated error messages like:

16:20:34 Processing bundle 156453120048 (start: 2019-07-31 00:00:00 +0000 UTC, end: 2019-08-01 00:00:00 +0000 UTC)
16:20:34 Getting Export Data for bundle 156453120048
16:30:36 failed json decode of record: unexpected EOF
16:30:43 Pausing; will retry operation in 30s
16:31:13 Checking if table fs_sync exists
16:31:14 Checking if table fs_export exists
16:31:15 Checking for new export files since 2019-07-31 00:00:00 +0000 UTC
16:31:15 Processing bundle 156453120048 (start: 2019-07-31 00:00:00 +0000 UTC, end: 2019-08-01 00:00:00 +0000 UTC)
...

The time from "Getting Export Data" to "failed json decode" is about 10 minutes, and that interval is consistent across many loops of this error. I think this is a timeout on DataExport downloads from the REST API. Our larger export files reach many GB in size, so we hit this problem when running hauser on our slower VMs.

I tried manually skipping a couple of DataExport files to see whether only one specific file had the problem, but I hit the same error on the next two days of DataExport files.

My config.toml:

FsApiToken = "redacted"
Backoff = "30s"
BackoffStepsMax = 8
CheckInterval = "30m"
TmpDir = "/tmp"
Warehouse="bigquery"
GroupFilesByDay = false
SaveAsJson = false

[gcs]
Bucket = "redacted"
GCSOnly = false

[bigquery]
Project = "redacted"
Dataset = "redacted"
ExportTable = "fs_export"
SyncTable = "fs_sync"

patrick-fs commented 4 years ago

Hi @PaulSchnau, thanks for letting us know about this. If you haven't already, would you please contact support@fullstory.com as well?

One thing you could try is increasing the frequency at which data export files are generated. This will create smaller files.

In Settings > Integrations & API Keys:

[screenshot of the settings page]

PaulSchnau commented 4 years ago

Thanks @patrick-fs, decreasing the time range fixes this for me.