aerosense-ai / data-gateway

Data influx for Aerosense.
https://www.aerosense.ai/

Reduce default upload window size #53

Closed cortadocodes closed 2 years ago

cortadocodes commented 2 years ago

Consider what the optimal upload window size / window file size is.

thclark commented 2 years ago

Well, it's not 10 minutes!

I get the following 413 (request entity too large) error when inserting 10-minute windows into the DB, so I'm going to try 60s batches and will report back on what the best value looks like.

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/functions_framework/__init__.py", line 171, in view_func
    function(data, context)
  File "/workspace/main.py", line 38, in upload_window
    window_handler.persist_window(window, window_metadata)
  File "/workspace/window_handler.py", line 91, in persist_window
    self.dataset.add_sensor_data(
  File "/workspace/big_query.py", line 99, in add_sensor_data
    errors = self.client.insert_rows(table=self.client.get_table(self.table_names["sensor_data"]), rows=rows)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/cloud/bigquery/client.py", line 3466, in insert_rows
    return self.insert_rows_json(table, json_rows, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/cloud/bigquery/client.py", line 3646, in insert_rows_json
    response = self._call_api(
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/cloud/bigquery/client.py", line 782, in _call_api
    return call()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
    return retry_target(
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/api_core/retry.py", line 190, in retry_target
    return target()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/cloud/_http/__init__.py", line 494, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.GoogleAPICallError: 413 POST https://bigquery.googleapis.com/bigquery/v2/projects/aerosense-twined/datasets/greta/tables/sensor_data/insertAll?prettyPrint=false: <!DOCTYPE html>

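As an alternative (or complement) to shortening the window, the rows could be split into several smaller insert requests so that no single `insertAll` call is too large. The following is only a rough sketch of that idea, not what the gateway does: `batch_rows_by_size` and `MAX_REQUEST_BYTES` are hypothetical names, the byte budget is an assumed conservative value rather than a figure from this thread, and the size estimate is based on the JSON encoding of each row.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: a conservative per-request byte budget chosen for illustration.
# Check the current BigQuery streaming-insert quotas before relying on it.
MAX_REQUEST_BYTES = 5_000_000


def batch_rows_by_size(
    rows: Iterable[Dict[str, Any]],
    max_bytes: int = MAX_REQUEST_BYTES,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of rows whose combined JSON size stays within max_bytes.

    A single row that is larger than max_bytes is still yielded, alone in
    its own batch, so no data is silently dropped.
    """
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0

    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8"))

        # Close the current batch if adding this row would exceed the budget.
        if batch and batch_bytes + row_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0

        batch.append(row)
        batch_bytes += row_bytes

    # Emit whatever is left over after the loop.
    if batch:
        yield batch


# Hypothetical usage with the client call that appears in the traceback:
# for batch in batch_rows_by_size(json_rows):
#     errors = client.insert_rows_json(table, batch)
```

Batching by encoded size rather than by a fixed row count means the same code would cope with sensors that produce rows of very different sizes, at the cost of serialising each row once more to measure it.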
thclark commented 2 years ago

I've reduced this to 60s in #96 and will continue to monitor.

thclark commented 2 years ago

Seems good at 60s (successfully uploaded mics data for 60s and baros data covering a similar period), so I'm closing this as done.