argasi / google-bigquery

Automatically exported from code.google.com/p/google-bigquery

Streaming table details missing (size + number of rows) #209

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Streaming ingestion is a fantastic feature, but when I look at the table 
details of a streaming table, the 'Table Size' and 'Number of Rows' items are 
always missing. It would be nice to have those available. 

I found a question on SO [1] which says these values become available after 
the buffer is flushed, but we stream to the table continuously, so maybe that's 
why they never appear? 

[1]: http://stackoverflow.com/questions/23635364/streamed-tables-dont-display-table-size-in-ui

Original issue reported on code.google.com by tij...@firigames.com on 13 Jan 2015 at 12:55

GoogleCodeExporter commented 9 years ago
This is known behavior, and it was originally intentional.

Streamed data ingestion doesn't update the table's modified time until a chunk 
of data is batched together and flushed. Because a table with streaming data 
has size and row counts that are likely changing constantly, we omitted the 
table size and row count calculation to avoid misrepresenting the state of the 
table, and to "more accurately" represent an unknown state.

Internally we've discussed this a bit and agree it's somewhat annoying... We'll 
likely modify the API in such a way as to either estimate the buffered data 
size at the time of the call, or explicitly split out the known and unknown 
portions of that data.

Original comment by seanc...@google.com on 13 Jan 2015 at 5:30

GoogleCodeExporter commented 9 years ago
In terms of documentation, the Table Resource documentation here indicates this 
field is unavailable for tables being streamed to:

https://cloud.google.com/bigquery/docs/reference/v2/tables#resource

Currently, "actively" implies "within the last day".
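
For anyone reading the Table resource programmatically, here is a minimal 
sketch of tolerating the missing values. It assumes the size and row count 
appear in the JSON response as the string-encoded fields `numBytes` and 
`numRows`, and that they are simply absent for a table being streamed to. The 
helper names `TableStats`, `read_table_stats` and `describe` are made up for 
this example and are not part of any BigQuery client library.

```python
from typing import NamedTuple, Optional


class TableStats(NamedTuple):
    """Size and row count of a table; None means the API did not report it."""
    num_bytes: Optional[int]
    num_rows: Optional[int]


def read_table_stats(table_resource: dict) -> TableStats:
    """Pull size and row count out of a table resource dict.

    Assumption: the fields are named "numBytes" and "numRows", hold
    integers encoded as strings, and are omitted while the values are
    unknown (for example, while the table is being streamed to).
    """
    def to_int(value: Optional[str]) -> Optional[int]:
        return int(value) if value is not None else None

    return TableStats(
        num_bytes=to_int(table_resource.get("numBytes")),
        num_rows=to_int(table_resource.get("numRows")),
    )


def describe(stats: TableStats) -> str:
    """Render the stats for display, treating missing values as unknown."""
    if stats.num_bytes is None or stats.num_rows is None:
        return "size and row count unknown (table may be receiving streamed data)"
    return f"{stats.num_rows} rows, {stats.num_bytes} bytes"
```

Treating the missing fields as "unknown" rather than zero matches the intent 
described above: a caller can show a placeholder instead of a misleading 
number.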

Original comment by seanc...@google.com on 13 Jan 2015 at 5:32

GoogleCodeExporter commented 9 years ago
Right, I figured it wasn't available because the table is being streamed to.

For our purposes, a 'not so accurate' number is fine (maybe accompanied by a 
timestamp). It is just to get a ballpark feel for how large a table is. Even 
a number updated daily would be fine for us. We stream data constantly, so 
however the values are computed, they won't be accurate for more than a few 
seconds anyway. 

Original comment by tij...@firigames.com on 13 Jan 2015 at 5:40