Open siewa001 opened 6 years ago
@siewa001 your issue is very interesting because we have exactly the same problem. We run two DCs with even lower latency (0.3 ms vs. 1 ms), and as soon as the client is not in the same DC as the master, communication is slower by a factor of at least 4.
Another GitHub issue mentions that multi-master / distributed metadata setups are coming this year, but I have no idea whether that would actually help.
I also wanted to check whether NFS-Ganesha would fix it, but I have not been able to test it yet.
What I wanted to add: someone I know has basically the same setup, with very similar DCs (also with regard to latency), but he is using MooseFS and apparently does not see such issues. I thought LFS had not been modified that much yet, especially in this regard, or am I wrong about that?
Nobody with a similar setup, or similar issues? Are we doing something wrong?
Maybe your description does not contain enough detailed background on how you use LizardFS, and on the deployment choices and tradeoffs you are comfortable with, to start a meaningful conversation?
Consider how helpful it would be for the people trying to help you to know the following:
@borkd to be honest, I thought there were already enough details in this thread to start a meaningful conversation (apart from the LFS version I'm running, of course). But regarding your questions:
I hope this adds enough useful information to find out whether this is an issue in my setup, something that can be fixed with configuration, or an actual bug. If I can do anything to help debug it, just tell me what you need.
Hi,
I'm running a test setup with LizardFS in two data centers (A and B); the typical latency between them is ~10 ms. The setup is the same in both DCs: master (primary in DC A, shadow in DC B), chunkserver, client. I know that I can configure the topology to make client B talk to chunkserver B to improve performance. But what about communication between the master and the client? Right now client B talks to the master in DC A, and that dramatically slows down communication.
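To give a rough feel for why the master's location matters, here is a minimal back-of-the-envelope sketch. It assumes every metadata operation needs at least one sequential round trip to the master. The function names, the 0.1 ms local round-trip time, and the 0.1 ms service time are illustrative assumptions, not measured LizardFS values; only the ~10 ms inter-DC latency comes from the setup described above.

```python
def estimated_seconds(num_ops, rtt_ms, service_ms=0.1, round_trips_per_op=1):
    """Estimate wall time for num_ops sequential metadata operations.

    Assumption: each operation waits for round_trips_per_op full round
    trips to the master plus a fixed service time (hypothetical value).
    """
    per_op_ms = service_ms + round_trips_per_op * rtt_ms
    return num_ops * per_op_ms / 1000.0


def slowdown(remote_rtt_ms, local_rtt_ms, service_ms=0.1):
    """Ratio of remote-master to local-master time per operation."""
    remote = estimated_seconds(1, remote_rtt_ms, service_ms)
    local = estimated_seconds(1, local_rtt_ms, service_ms)
    return remote / local


if __name__ == "__main__":
    ops = 10_000  # e.g. stat/open calls on many small files
    # 0.1 ms local RTT is an assumed value; 10 ms is the inter-DC latency.
    for label, rtt_ms in (("master in same DC", 0.1), ("master in remote DC", 10.0)):
        print(f"{label}: {estimated_seconds(ops, rtt_ms):.1f} s")
    print(f"per-operation slowdown: {slowdown(10.0, 0.1):.1f}x")
```

This model ignores caching, pipelining, and data transfer to chunkservers, so it only bounds the metadata part of the workload. It still suggests that workloads with many small sequential operations would be affected far more than large streaming reads or writes.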
My questions:
After testing, I want to deploy a similar setup at large scale, but first I have to solve the geo-communication problem, so any input or concept will be appreciated.