Netflix / genie

Distributed Big Data Orchestration Service
https://netflix.github.io/genie
Apache License 2.0

Remove read only from most query persistence APIs #1156

Closed tgianos closed 2 years ago

tgianos commented 2 years ago

Because we cannot know how the system is deployed, readOnly = true can behave in different ways.

For example, with Amazon Aurora and the MariaDB driver, setting readOnly = true sends all requests to a read replica endpoint, which is subject to lag and inconsistency. There is no serializable isolation level. While the potential benefits are nice, they are not worth the risk of side effects, either within the system codebase itself, as modules are replaced with unknown implementations, or for REST API clients who expect consistent responses (e.g. read after write following an HTTP 200 response).
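To make the read-after-write risk concrete, here is a minimal toy model in plain Java. It is not Genie code and does not use the MariaDB driver. `ToyCluster`, `write`, `read`, and `replicate` are hypothetical names invented for this sketch, and the two in-memory maps only stand in for a writer endpoint and a lagging read replica.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ReplicaLagDemo {

    /** Toy data source (hypothetical): one writer plus one asynchronously updated replica. */
    static final class ToyCluster {
        private final Map<String, String> primary = new HashMap<>();
        private final Map<String, String> replica = new HashMap<>();

        /** Writes always land on the primary. */
        void write(String id, String value) {
            primary.put(id, value);
        }

        /** Reads are routed to the replica when readOnly is true, otherwise to the primary. */
        Optional<String> read(String id, boolean readOnly) {
            Map<String, String> target = readOnly ? replica : primary;
            return Optional.ofNullable(target.get(id));
        }

        /** Stands in for replication catching up after some lag. */
        void replicate() {
            replica.putAll(primary);
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster();
        cluster.write("job-1", "RUNNING");

        // Read-after-write on the read-only path misses the row until replication runs.
        System.out.println("readOnly=true  before replication: " + cluster.read("job-1", true));
        System.out.println("readOnly=false before replication: " + cluster.read("job-1", false));

        cluster.replicate();
        System.out.println("readOnly=true  after replication:  " + cluster.read("job-1", true));
    }
}
```

In this model, a client that creates a resource and immediately fetches it through a read-only path can get an empty result even though the write succeeded, which is the kind of inconsistency this change is meant to avoid.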

Removing readOnly = true from all but the non-critical search APIs may cost some performance due to JPA flush and context evaluation, but the consistency guarantees are likely worth the tradeoff for most of these queries, which are point queries against index-backed columns anyway.

coveralls commented 2 years ago


Coverage remained the same at 93.771% when pulling 106cf53ffea7614abc9563bb1cebf4530fefc818 on tgianos:removeReadOnly into d38f58cfd3b242c29fec317b31b64be70623c608 on Netflix:4.1.x.