qubole / rubix

Cache File System optimized for columnar formats and object stores
Apache License 2.0

Changes to fix Local Data Transfer Server connectivity issues. #390

harmandeeps closed 4 years ago

harmandeeps commented 4 years ago
shubhamtagra commented 4 years ago

We also need some negative testing. Since it is difficult to create faults deterministically, you can add code at different places that randomly injects faults, and exercise it with a long-running test suite. E.g. randomly close the channel before and after DataTransferClientHelper.writeHeaders in NonLocalRRC, randomly close the channel while reading chunks in NonLocalRRC, randomly close the connection from the server at different places inside ClientServiceThread, etc. Add enough instrumentation to keep track of connections created, connections destroyed, connections on the queue, etc. Then run a test suite that submits many queries in parallel for a few hours, and at the end check for connection leaks and query failures.
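
A rough sketch of what such random fault injection plus instrumentation could look like. Everything here is hypothetical and only for illustration: `FaultInjector`, `ConnectionStats`, `maybeClose`, and `simulate` are made-up names, not existing RubiX classes, and the simulated loop merely stands in for the real call sites (around `writeHeaders`, while reading chunks, inside `ClientServiceThread`). The seed makes a failing run reproducible.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

public class FaultInjectionSketch {

    /** Hypothetical counters for connection lifecycle events. */
    public static final class ConnectionStats {
        public final AtomicLong created = new AtomicLong();
        public final AtomicLong destroyed = new AtomicLong();
        public final AtomicLong pooled = new AtomicLong();

        /** Connections that were created but neither destroyed nor returned to the pool. */
        public long leaked() {
            return created.get() - destroyed.get() - pooled.get();
        }
    }

    /** Hypothetical helper that closes a channel with a fixed probability. */
    public static final class FaultInjector {
        private final Random random;
        private final double probability;
        public final AtomicLong injected = new AtomicLong();

        public FaultInjector(long seed, double probability) {
            this.random = new Random(seed);
            this.probability = probability;
        }

        /** Closes the channel with the configured probability; returns true if it did. */
        public boolean maybeClose(Closeable channel) throws IOException {
            double roll;
            synchronized (random) {
                roll = random.nextDouble();
            }
            if (roll < probability) {
                channel.close();
                injected.incrementAndGet();
                return true;
            }
            return false;
        }
    }

    /** Simulates requests with two injection points each and records the outcome. */
    public static ConnectionStats simulate(int requests, FaultInjector injector) throws IOException {
        ConnectionStats stats = new ConnectionStats();
        for (int i = 0; i < requests; i++) {
            stats.created.incrementAndGet();
            final boolean[] open = {true};
            Closeable channel = () -> open[0] = false; // stand-in for a real socket channel

            injector.maybeClose(channel);     // injection point: around the header write
            if (open[0]) {
                injector.maybeClose(channel); // injection point: while reading chunks
            }

            if (open[0]) {
                stats.pooled.incrementAndGet();    // healthy connection goes back on the queue
            } else {
                stats.destroyed.incrementAndGet(); // broken connection must be destroyed, not reused
            }
        }
        return stats;
    }

    public static void main(String[] args) throws IOException {
        FaultInjector injector = new FaultInjector(42L, 0.1);
        ConnectionStats stats = simulate(10_000, injector);
        System.out.println("created=" + stats.created + " destroyed=" + stats.destroyed
                + " pooled=" + stats.pooled + " injected=" + injector.injected
                + " leaked=" + stats.leaked());
    }
}
```

In a real long-running run, the same counters would be updated by the client and server code paths themselves, and a non-zero `leaked()` at the end would point to a connection that was dropped without being destroyed or returned to the queue.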