os-climate / os_c_data_commons

Repository for Data Commons platform architecture overview, as well as developer and user documentation
Apache License 2.0

Increase memory for trino workers #112

Open erikerlandson opened 2 years ago

erikerlandson commented 2 years ago

We have some evidence that queries are failing due to memory limitations on the Trino workers. The proposal is to double the memory from its current setting of 4GB to 8GB. We should also investigate whether increasing the coordinator memory is relevant.
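For concreteness, a minimal sketch of what the heap change amounts to, assuming the workers use the standard Trino etc/jvm.config for heap sizing (how that file is actually templated in our OpenShift deployment may differ, so treat this as illustrative):

-Xmx8G

The pod memory requests/limits on OpenShift would need to grow to match.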

erikerlandson commented 2 years ago

cc @MichaelTiemannOSC @caldeirav

erikerlandson commented 2 years ago

We also need to check whether we have to add any OpenShift nodes to accommodate the extra resources.

MichaelTiemannOSC commented 2 years ago

I have a query that supposedly returns 9 rows:

qres = engine.execute("""
select count (*)
from sec_dera.sub as S join sec_dera.num as N on S.cik=1144800 and S.fy >=DATE('2018-01-01') and S.adsh=N.adsh and N.ddate>=DATE('2018-01-01') and abs(date_diff('day', fy, ddate)) > 366
""")
qres.fetchall()

[(9,)]

However, when I try to look at the nine rows:

qres = engine.execute("""
select S.fy, N.ddate, tag
from sec_dera.sub as S join sec_dera.num as N on S.cik=1144800 and S.fy >=DATE('2018-01-01') and S.adsh=N.adsh and N.ddate>=DATE('2018-01-01') and abs(date_diff('day', fy, ddate)) > 366
""")
qres.fetchall()

I get this:

---------------------------------------------------------------------------
TrinoQueryError                           Traceback (most recent call last)
<ipython-input-51-f3d48ff6bbc4> in <module>
      3 from sec_dera.sub as S join sec_dera.num as N on S.cik=1144800 and S.fy >=DATE('2018-01-01') and S.adsh=N.adsh and N.ddate>=DATE('2018-01-01') and abs(date_diff('day', fy, ddate)) > 366
      4 """)
----> 5 l = qres.fetchall()
      6 len(l)

/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/result.py in fetchall(self)
   1215             return l
   1216         except BaseException as e:
-> 1217             self.connection._handle_dbapi_exception(
   1218                 e, None, None, self.cursor, self.context
   1219             )

/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
   1466                 util.raise_from_cause(sqlalchemy_exception, exc_info)
   1467             else:
-> 1468                 util.reraise(*exc_info)
   1469 
   1470         finally:

/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
    127         if value.__traceback__ is not tb:
    128             raise value.with_traceback(tb)
--> 129         raise value
    130 
    131     def u(s):

/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/result.py in fetchall(self)
   1211 
   1212         try:
-> 1213             l = self.process_rows(self._fetchall_impl())
   1214             self._soft_close()
   1215             return l

/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/result.py in _fetchall_impl(self)
   1161     def _fetchall_impl(self):
   1162         try:
-> 1163             return self.cursor.fetchall()
   1164         except AttributeError:
   1165             return self._non_result([])

/opt/app-root/lib64/python3.8/site-packages/trino/dbapi.py in fetchall(self)
    455 
    456     def fetchall(self) -> List[List[Any]]:
--> 457         return list(self.genall())
    458 
    459     def cancel(self):

/opt/app-root/lib64/python3.8/site-packages/trino/client.py in __iter__(self)
    441         # Subsequent fetches from GET requests until next_uri is empty.
    442         while not self._query.finished:
--> 443             rows = self._query.fetch()
    444             for row in rows:
    445                 self._rownumber += 1

/opt/app-root/lib64/python3.8/site-packages/trino/client.py in fetch(self)
    518         """Continue fetching data for the current query_id"""
    519         response = self._request.get(self._request.next_uri)
--> 520         status = self._request.process(response)
    521         if status.columns:
    522             self._columns = status.columns

/opt/app-root/lib64/python3.8/site-packages/trino/client.py in process(self, http_response)
    388         logger.debug("HTTP %s: %s", http_response.status_code, response)
    389         if "error" in response:
--> 390             raise self._process_error(response["error"], response.get("id"))
    391 
    392         if constants.HEADER_CLEAR_SESSION in http_response.headers:

TrinoQueryError: TrinoQueryError(type=INSUFFICIENT_RESOURCES, name=EXCEEDED_LOCAL_MEMORY_LIMIT, message="Query exceeded per-node user memory limit of 819.20MB [Allocated: 812.94MB, Delta: 6.47MB, Top Consumers: {HashBuilderOperator=812.94MB}]", query_id=20211219_122314_00099_bgxv3)

Am I missing something obvious?

erikerlandson commented 2 years ago

@MichaelTiemannOSC It looks like you are bumping up against this:

query.max-memory-per-node

    Type: data size

    Default value: (JVM max memory * 0.1)

This is the max amount of user memory a query can use on a worker. User memory is allocated during execution for things that are directly attributable to, or controllable by, a user query. For example, memory used by the hash tables built during execution, memory used during sorting, etc. When the user memory allocation of a query on any worker hits this limit, it is killed.

https://trino.io/docs/current/admin/properties-memory-management.html#query-max-memory-per-node

This is actually big progress: the new memory configuration causes your query to fail with a proper error message instead of just crashing the worker nodes.
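The numbers are consistent with that default: assuming query.max-memory-per-node was left unset and the worker JVM heap is now 8GB, the per-node limit works out to exactly the value reported in the error:

query.max-memory-per-node (default) = JVM max memory * 0.1
                                    = 8192MB * 0.1
                                    = 819.2MB

and the HashBuilderOperator in the Top Consumers list had already allocated 812.94MB of it.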

MichaelTiemannOSC commented 2 years ago

I've noticed that the field COREG in SEC_DERA.NUM is more important than I first thought. When COREG is NULL, the fact is reported by the consolidated entity, which means one fact of its kind per report, which is what I generally expect. However, when COREG is not null, it means somebody else is reporting a fact about that company, and there can be many sources citing that fact. I'm still trying to figure out exactly what it means, but I'm now looking very differently at my joins, knowing that COREG IS NULL gives one hit in one case and otherwise can give 7,000 or 941,000 hits, depending. Fun!
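As a sketch of what that implies for the failing query above (illustrative; it assumes that restricting to COREG IS NULL, i.e. facts reported by the consolidated entity only, is the semantics I actually want):

qres = engine.execute("""
select S.fy, N.ddate, tag
from sec_dera.sub as S join sec_dera.num as N
    on S.cik=1144800 and S.fy >= DATE('2018-01-01')
    and S.adsh=N.adsh and N.ddate >= DATE('2018-01-01')
    and abs(date_diff('day', fy, ddate)) > 366
where N.coreg is null  -- keep only facts reported by the consolidated entity
""")
qres.fetchall()

Besides matching the intended semantics, this should also shrink the hash table built for the join, which is exactly what blew past the per-node memory limit above (HashBuilderOperator was the top consumer).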

erikerlandson commented 2 years ago

This is a good example of the duality of 1) providing sufficient resources and 2) developing a deep enough understanding of the data and query behavior to design queries that are as performant as possible.

Regarding (1), I think it would be reasonable to move to 16GB of RAM, and also perhaps to double query.max-memory-per-node and query.max-total-memory-per-node from their default values.
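For example (values illustrative; they assume the documented defaults scale with the heap and that nothing has been overridden), with the workers at a 16GB heap the doubled limits would look roughly like this in config.properties:

# the per-node user-memory default at a 16GB heap is 0.1 * 16GB = 1.6GB; doubled:
query.max-memory-per-node=3.2GB
# illustrative value for the corresponding total-memory limit; its default depends on the Trino version
query.max-total-memory-per-node=6.4GB

The worker heap itself would move to -Xmx16G in jvm.config, as in the sketch in the first comment.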

MichaelTiemannOSC commented 2 years ago

I'm not asking to raise limits willy-nilly. I totally concur that we need to better understand the data and query side as well.

That said, I also don't have much experience with database provisioning, nor with what constitutes reasonable per-node maxima. Happy to watch and learn on that front.

HeatherAck commented 1 year ago

Per Thomas: this, as well as Jupyter notebook memory sizing, needs to be increased.