metal-stack / go-ipam

golang library for ip address management
MIT License
123 stars 42 forks source link

Is this lib stateless ready with Redis storage? #76

Closed dzvancuks closed 2 years ago

dzvancuks commented 2 years ago

I'm trying to implement IPAM in stateless environment using Redis as central DB. After analyzing redis module it seems that there will be racing and traffic congestion issues.

  1. To AcquireIP() library first calls ReadPrefix(). Then it locally fetches next available IP. After that UpdatePrefix() to write data back. Problem here that another node might have called ReadPrefix() with same state and will try to attempt same UpdatePrefix()
  2. UpdatePrefix() relates on 'optimistic lock' by checking version. On failure (other node changed version) the whole process will be started again up to 10 times. Here I see that in case there are many nodes trying to work with same CIDR there will be racing condition. In the end current node can easily stumble on 10 retries and fail to allocate IP. Also Redis locks data only on read-write operations. So between version Read and updated prefix Write operation another node might update same prefix data. This result that 2 nodes allocates same IP from CIDR pool and both will consider operation to be successful.
  3. Traffic congestion on lock checks, retries.

Library is OK for single node operation, but won't do on multinode due to lack of syncronization. Easy and dirty solution would be SETNX as a mutex. But this will slow operatons while other nodes wait for lock release.

There is also PubSub model. Where nodes can passively watch for key chanes. This way nodes can track updates and react faster on them.

And finally we should keep in mind latency. Extra checks and retries increases traffic. Instead it is more preferable to Pipeline or Script Redis commands.

majst01 commented 2 years ago

We actually run go-ipam with postgres backend with multiple instance and no problems because postgres has a good optimistic lock implementation we use.

I must admit that i am not an expert in Redis, but as far as i can tell the integration tests pass for redis, and in the integration test there are many tests for this exact szenario.

dzvancuks commented 2 years ago

Yes, I see that SQL storage uses transactions with commit/rollback action.

Updated title to emphasize Redis commands.

majst01 commented 2 years ago

I would appreciate a pull request which will ensure the same guarantees for the redis storage

dzvancuks commented 2 years ago

I have to admit I missed the Watch command. It triggers transaction with similar to commit or rollback operations.

Sorry for that. I'll close issue.