In isolation tests, we cannot test queries sent from MX nodes.
To catch distributed locks within the isolation test logic, we implemented UDFs that detect when a distributed transaction is preventing our coordinator session from acquiring a lock. However, that logic only works when the distributed transaction is started from the coordinator. Our distributed deadlock detection runs as a separate process on each MX node to make the detection logic more scalable. Since we cannot connect to worker nodes in the isolation tests, we cannot detect from the coordinator node that two different MX workers are blocking each other.
We use get_all_active_transactions to get the list of transactions initiated from the coordinator. Comparing those transaction IDs with the results of dump_global_wait_edges then gives us the ID of the transaction that is preventing the session from acquiring the lock. To detect that two workers are blocking each other, we would need something like a get_global_all_active_transactions function to use in our isolation tests.
Since we don't have such infrastructure, we could not implement the intended tests in:
Distributed deadlock PRs
Writing to reference tables from MX nodes PR (#2333)
Since we are planning to improve the capabilities of our MX nodes, being able to test new functionality in our isolation tests would make subsequent changes more robust.
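To make the gap concrete, here is a minimal Python sketch of the coordinator-side comparison described above, run against in-memory data rather than a real cluster. The `ActiveTransaction` and `WaitEdge` types, their field names, and the `visible_blockers` helper are hypothetical simplifications for illustration. They do not mirror the actual result columns of the UDFs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveTransaction:
    # Hypothetical, simplified row from the coordinator's active-transaction list.
    transaction_number: int


@dataclass(frozen=True)
class WaitEdge:
    # Hypothetical, simplified wait edge: one transaction waits on another.
    waiting_transaction: int
    blocking_transaction: int


def visible_blockers(active, edges, waiting_transaction):
    """Return blockers of `waiting_transaction` that appear in `active`.

    Only transactions present in the coordinator's active list can be matched,
    which is why worker-initiated transactions stay invisible.
    """
    known = {t.transaction_number for t in active}
    return sorted(
        {
            e.blocking_transaction
            for e in edges
            if e.waiting_transaction == waiting_transaction
            and e.blocking_transaction in known
        }
    )


# Case 1: both transactions started on the coordinator; 2 waits on 1.
coordinator_active = [ActiveTransaction(1), ActiveTransaction(2)]
coordinator_edges = [WaitEdge(waiting_transaction=2, blocking_transaction=1)]
coordinator_result = visible_blockers(coordinator_active, coordinator_edges, 2)

# Case 2: transactions 10 and 11 started on two MX workers and block each
# other. The coordinator's active list does not contain them, so no match.
worker_edges = [WaitEdge(10, 11), WaitEdge(11, 10)]
worker_result = visible_blockers(coordinator_active, worker_edges, 10)
```

In the first case the blocker is found. In the second the result is empty even though the wait edges form a cycle, which is the situation a global variant of the active-transaction list would be expected to cover.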