m-lab / etl-gardener

Gardener provides services for maintaining and reprocessing mlab data.
Apache License 2.0

Strange BigQuery behavior #276

Open gfr10598 opened 4 years ago

gfr10598 commented 4 years ago

Several strange behaviors emerged while developing the post-processing steps.

  1. bqiface Table.CopierFrom did not appear to actually copy the partition. I worked around this by calling bigquery.Table.CopierFrom directly, but that causes headaches with dependency injection.
  2. The copy from tmp_ndt to raw_ndt succeeds, and raw_ndt appears to contain the correct data, BUT the ParseInfo.ParseTime values don't match the values in the tmp_ndt partitions. They appear to match the values from the PREVIOUS cycle instead.
  3. Queries against tmp_ndt continue to return rows from recently deleted partitions, even though the CLI bq show command reports metadata indicating that the partitions are empty. [This began behaving correctly for ndt7 after a few days, on May 12, but continued to behave incorrectly for tcpinfo.]
  4. New rows from the parser (with a recent ParseTime) don't seem to show up in queries against the active partitions in tmp_ndt.

laiyi-ohlsen commented 4 years ago

@gfr10598 Is this still happening?