While deploying to staging, we found that the xdsrelay pods were being OOM-killed whenever streams were retried.
The OOM was caused by a goroutine leak in the upstream client.
To fix it, we added go.uber.org/goleak (doc) to the test cases and used the failing tests to find and fix the leaks. With this change, the pods are no longer OOM-killed.

Signed-off-by: Jyoti Mahapatra jmahapatra@lyft.com