SolarArbiter / solarforecastarbiter-api

HTTP API and database schema for the Solar Forecast Arbiter
https://api.solarforecastarbiter.org
MIT License
10 stars 6 forks source link

mysql fails on large insertions #242

Closed sentry-io[bot] closed 4 years ago

sentry-io[bot] commented 4 years ago

We should update max_allowed_packet to reflect whatever maximum size we choose and implement some checking. @wholmgren suggested 100MB in https://github.com/SolarArbiter/solarforecastarbiter-dashboard/issues/254 . We should also return an error message that gives the hard limit on file size and a suggestion of approximately how many datapoints would fit in a file of that size. This might vary between json/csv so maybe leaving out the latter would suffice but the primary issue appears to be the size of the data after we've converted it to format for sql insertion. So, we should base our estimates on the size of the data as passed to the procedures. We'll also need to verify that timeouts are adjusted to properly to allow time for inserting larger datasets.

Sentry Issue: API-47

InternalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes")
(6 additional frame(s) were not displayed)
...
  File "pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)

BrokenPipeError: [Errno 32] Broken pipe
(1 additional frame(s) were not displayed)
...
  File "gevent/_ssl3.py", line 510, in sendall
    return socket.sendall(self, data, flags)
  File "gevent/_socket3.py", line 527, in sendall
    return _socketcommon._sendall(self, data_memory, flags)
  File "gevent/_socketcommon.py", line 380, in _sendall
    timeleft = __send_chunk(socket, chunk, flags, timeleft, end)
  File "gevent/_socketcommon.py", line 309, in __send_chunk
    data_sent += socket.send(chunk, flags)
  File "gevent/_ssl3.py", line 474, in send
    return self._sslobj.write(data)

OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
(11 additional frame(s) were not displayed)
...
  File "contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "sfa_api/utils/storage_interface.py", line 168, in get_cursor
    connection.rollback()
  File "pymysql/connections.py", line 429, in rollback
    self._execute_command(COMMAND.COM_QUERY, "ROLLBACK")
  File "pymysql/connections.py", line 771, in _execute_command
    self._write_bytes(packet)
  File "pymysql/connections.py", line 718, in _write_bytes
    "MySQL server has gone away (%r)" % (e,))
alorenzo175 commented 4 years ago

max_allowed_packet is now 128MB. I don't want to go too large and slow the database from responding to other requests.

100MB might be too big due to the time it takes to process it so that the request times out.

lboeman commented 4 years ago

Sweet. Thanks for updating that. I'll poke it and try to get it tuned and add some reasonable responses.

alorenzo175 commented 4 years ago

243 and #245 should mostly handle this. Very large raw reports may still fail, but we may want to deal with them separately