Open time-trader opened 2 years ago
60 minutes is quite long for a window. Can you stick to 10 minutes? If so then problem solved ;)
Tom Clark
On 21 Feb 2022, at 23:31, time-trader wrote:
Bug report
What is the current behavior?
Executing the cloud function with windows of window_size=60 and with Diff_Baros or Baros sensor data fails with the following error:
```
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/functions_framework/__init__.py", line 171, in view_func
    function(data, context)
  File "/workspace/main.py", line 41, in upload_window
    window_handler.persist_window(unix_timestamped_window["sensor_data"], window_metadata)
  File "/workspace/window_handler.py", line 99, in persist_window
    self.dataset.add_sensor_data(
  File "/workspace/big_query.py", line 80, in add_sensor_data
    errors = self.client.insert_rows(table=self.client.get_table(self.table_names["sensor_data"]), rows=rows)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 3411, in insert_rows
    return self.insert_rows_json(table, json_rows, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 3590, in insert_rows_json
    response = self._call_api(
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 760, in _call_api
    return call()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
    return retry_target(
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/api_core/retry.py", line 190, in retry_target
    return target()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 480, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.GoogleAPICallError: 413 POST https://bigquery.googleapis.com/bigquery/v2/projects/aerosense-twined/datasets/greta/tables/sensor_data/insertAll?prettyPrint=false: <!DOCTYPE html>
```
This might be related to the JSON file being too big? However, checking the ingress bucket shows that all files are below 20 MB.
It's 60 seconds.
> This might be related to JSON file being too big? However checking the ingress bucket shows that all files are below 20 MB
This refers to Dataflow rather than BigQuery, but a 413 does suggest a payload that's too big. Do you have an idea of how big each batch is?
The size of the JSON window files in the ingress bucket (aerosense-ingress-eu/data_gateway/790791) is anywhere between 1 kB and 12 MB. I don't know if the payload is the same size...
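One rough way to check whether the payload matches the file size is to serialise the rows the way they would be sent and count the bytes. The sketch below is only an illustration: the row fields (`sensor_type`, `installation_reference`, `datetime`, `sensor_value`) and the helper names are hypothetical, and the real schema built in `big_query.py` may differ. It assumes each sample in a window becomes one row with the metadata repeated, wrapped roughly the way an `insertAll` request body wraps rows.

```python
import json


def window_to_rows(sensor_type, samples, installation_reference):
    """Expand a window into one row per sample (hypothetical schema).

    Each sample is assumed to be [timestamp, value_1, value_2, ...].
    """
    return [
        {
            "sensor_type": sensor_type,
            "installation_reference": installation_reference,
            "datetime": timestamp,
            "sensor_value": values,
        }
        for timestamp, *values in samples
    ]


def estimate_payload_bytes(rows):
    """Approximate the size in bytes of a JSON request body carrying these rows."""
    body = {"rows": [{"json": row} for row in rows]}
    return len(json.dumps(body).encode("utf-8"))


# Made-up window: 1000 samples of [timestamp, value, value].
samples = [[i * 0.01, 1.0, 2.0] for i in range(1000)]
rows = window_to_rows("Diff_Baros", samples, "790791")

file_bytes = len(json.dumps(samples).encode("utf-8"))
payload_bytes = estimate_payload_bytes(rows)
print(f"compact file: {file_bytes} bytes, row payload: {payload_bytes} bytes")
```

Because the metadata is repeated on every row, the row payload comes out larger than the compact window file, so a file below the limit does not guarantee a request below it.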
With more testing:
window_size = 60s: the error occurs for either diffBaros or Baros sensor data.
window_size = 30s: the error occurs only with diffBaros sensor data, but NOT with Baros.
NOTE: Batches with diffBaros data are actually smaller than ones with Baros, so I assume the error has more to do with sampling frequency (diffBaros are sampled much more frequently) and with how the batch is processed and inserted into BigQuery. I noticed that when I export the BigQuery request I get JSON lines with lots of repeated data (sensor name etc.) for every sample...
With window_size = 20s, this error does not occur.
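If the 413 does come from the number and size of rows sent in a single `insert_rows` call, one possible workaround is to split the rows into smaller requests instead of shrinking the window. The sketch below is an assumption-laden illustration, not the project's code: `batch_rows_by_size` and `insert_in_batches` are hypothetical helpers, and the default limits are conservative guesses that should be checked against the current BigQuery streaming-insert quotas.

```python
import json


def batch_rows_by_size(rows, max_bytes=5_000_000, max_rows=500):
    """Yield lists of rows, each under a row-count and approximate byte budget.

    The defaults are assumptions, not values taken from the source thread.
    """
    batch, batch_bytes = [], 0
    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8"))
        # Start a new batch when adding this row would exceed either limit.
        if batch and (batch_bytes + row_bytes > max_bytes or len(batch) >= max_rows):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(row)
        batch_bytes += row_bytes
    if batch:
        yield batch


def insert_in_batches(rows, insert_fn, **batch_kwargs):
    """Call insert_fn once per batch and collect any errors it returns."""
    errors = []
    for batch in batch_rows_by_size(rows, **batch_kwargs):
        errors.extend(insert_fn(batch))
    return errors
```

In `add_sensor_data`, `insert_fn` could be something like `lambda batch: self.client.insert_rows(table=table, rows=batch)`, with `table` fetched once via `self.client.get_table(...)`; `insert_rows` returns a sequence of per-row errors, which is what the helper collects.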