Open sunahsuh opened 5 years ago
I've investigated and it looks like both BigQuery (Alpha) and BigQuery (Beta) are actually referencing the same views, and there are other views in the same data source that have schemas visible. So as you said, @sunahsuh, this is unexpected and is probably unrelated to anything being a view.
However, it seems there are timeouts occurring while the schema processing task is running, so some table information never gets updated or stored. This data source probably just has a lot of tables, and the task doesn't finish processing schemas for all of them before it times out.
Will need further investigation to see whether it's a small difference in timeout and we can just increase it or if the processing needs to be broken up further in some way.
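To make the "broken up further" option concrete, here is a minimal sketch of batch-wise schema processing under a time budget. It is not Redash's actual code: `refresh_schemas`, `fetch_schema`, `store`, and the batch size are all hypothetical names and values chosen for illustration. Only the 10-minute default mirrors the timeout mentioned in this thread.

```python
import time
from typing import Callable, Dict, Iterable, List


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def refresh_schemas(
    tables: List[str],
    fetch_schema: Callable[[str], List[str]],  # hypothetical: returns column names
    store: Dict[str, List[str]],               # hypothetical: stand-in for the schema cache
    batch_size: int = 50,                      # assumed value, not from the thread
    budget_seconds: float = 600.0,             # 10 min, the timeout mentioned above
) -> List[str]:
    """Process tables batch by batch; return the tables not reached in time."""
    deadline = time.monotonic() + budget_seconds
    remaining = list(tables)
    for batch in chunked(list(tables), batch_size):
        # Stop before starting a new batch once the budget is used up,
        # so finished batches stay stored instead of the whole run failing.
        if time.monotonic() >= deadline:
            break
        for name in batch:
            store[name] = fetch_schema(name)
        remaining = remaining[len(batch):]
    return remaining
```

The returned list could be handed to a follow-up run, so a large data source would converge over several runs rather than timing out on every one.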
A link back to the same issue filed elsewhere: https://bugzilla.mozilla.org/show_bug.cgi?id=1584036
Another update:
We ran the schema processing function manually, without a timeout. It took ~20min (the timeout is 10min). This resolved the issue for the time being; telemetry.voice and the other tables should all have visible schemas now.
Essentially what happened was that many tables were recently removed/added within a short time frame, so when Redash was processing the schema changes, it needed more time to prune/remove the old tables and add new ones. This was timing out on every run, so schema updates were never happening.
We will need to decide whether to just increase the timeout to accommodate such scenarios and/or rewrite the update function to be more efficient, depending on how frequently big schema changes like this are likely to occur.
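As a rough illustration of the prune/add step described above, the sketch below computes which tables to remove and which to add with set differences. It is an assumption about how a cheaper update could look, not the existing update function. `diff_tables` and `old_table` are hypothetical; the other table names are ones mentioned in this thread.

```python
from typing import Set, Tuple


def diff_tables(stored: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_remove, to_add) given stored and currently existing table names."""
    to_remove = stored - current  # tables we still have cached but that no longer exist
    to_add = current - stored     # tables that exist now but are not cached yet
    return to_remove, to_add


# Hypothetical snapshot: "old_table" was dropped and "voice" was added.
stored_tables = {"crash", "event", "old_table"}
current_tables = {"crash", "event", "voice"}
to_remove, to_add = diff_tables(stored_tables, current_tables)
```

With this shape, only the tables in `to_add` would need a schema fetch, and unchanged tables would be skipped entirely.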
Awesome, thanks for looking into this @emtwo! I'm good to close this issue since the immediate problem is fixed, but if you want to keep this open to track the decision for a long-term solution, that's okay with me.
I noticed this while trying to look at the schema for telemetry.voice under the "BigQuery (Beta)" source, but I see the same for crash, event, and main, which are all direct ping tables (but interestingly, not downgrade, first_shutdown, voice_feedback, which are also direct ping tables). @emtwo suggested we might have issues with fetching schemas for views, but from the looks of BigQuery's schema browser it looks like nearly all of the tables in moz-fx-data-derived-datasets:telemetry are views.

More curiousness that I just noticed: the voice schema is visible in the "BigQuery (Alpha)" source.