Summary
This document describes a feature management system that answers requests from the wallets with the correct feature flag value, depending on the wallet's identifier.
Motivation
We need this mechanism to safely roll out the new Wallet Service to the wallets, enabling it for a controlled percentage of users so we have more control over exposure.
The idea here is to have a simpler version of Optimizely.
Guide-level explanation
For an initial implementation, these are the features that we need to implement:
Select a percentage of the user base for each feature flag
Use the user wallet's first_address to uniquely identify users
Return a boolean randomly but respecting the rollout percentage
Return the same value every time the same user (identified by the first_address) requests the same feature flag
We need to have an environment for each feature flag, e.g. production, development
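The requirements above can be sketched as follows. This is a minimal illustration, not the final implementation: the function names, the key format, and the in-memory `Map` (standing in for the real store) are all assumptions made for the example.

```typescript
// Hypothetical in-memory stand-in for the persistent store.
const storedValues = new Map<string, boolean>();

// Returns true with probability `percentage / 100`.
// `rng` must return a number in [0, 1); it is injectable so tests can be deterministic.
function drawValue(percentage: number, rng: () => number = Math.random): boolean {
  return rng() * 100 < percentage;
}

// Returns the stored value for this user and flag, drawing it on the first request.
function getFeatureFlagValue(
  environment: string,
  featureFlag: string,
  firstAddress: string,
  percentage: number,
  rng: () => number = Math.random,
): boolean {
  // Assumed key format: one entry per environment, flag and user.
  const key = `${environment}:${featureFlag}:${firstAddress}`;
  const existing = storedValues.get(key);
  if (existing !== undefined) {
    return existing; // same user, same flag: same answer as before
  }
  const value = drawValue(percentage, rng);
  storedValues.set(key, value);
  return value;
}
```

Because the drawn value is stored, a second request from the same first_address returns the same boolean regardless of what the random source would produce the second time.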
The API to request the feature flag value is as follows:
GET /features/<environment>/<feature_flag>/<identifier>
Response
StatusCode: 200
Body:
{
"value": true
}
Response on not found
StatusCode: 404
Body: empty
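As a rough sketch of this contract, the handler below maps a request path to the two responses described above. The names `handleFeatureRequest`, `FlagLookup` and `HttpResponse` are hypothetical, and it assumes that 404 is returned both for an unknown feature flag and for a path that does not match the route.

```typescript
interface HttpResponse {
  statusCode: number;
  body: string;
}

// Hypothetical lookup: returns undefined when the flag is not known.
type FlagLookup = (
  environment: string,
  featureFlag: string,
  identifier: string,
) => boolean | undefined;

// Matches /features/<environment>/<feature_flag>/<identifier>
const FEATURE_ROUTE = /^\/features\/([^/]+)\/([^/]+)\/([^/]+)$/;

function handleFeatureRequest(path: string, lookup: FlagLookup): HttpResponse {
  const match = FEATURE_ROUTE.exec(path);
  if (match === null) {
    return { statusCode: 404, body: "" };
  }
  const [, environment, featureFlag, identifier] = match;
  const value = lookup(environment, featureFlag, identifier);
  if (value === undefined) {
    // Assumed "not found" case: the flag is not defined.
    return { statusCode: 404, body: "" };
  }
  return { statusCode: 200, body: JSON.stringify({ value }) };
}
```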
Decisions:
We will not have an audience table. The identifier of the context will be part of the FeatureFlag identifier string, e.g. mobile-wallet_service_rollout
We will have a version column on the UserFeatureFlag and FeatureFlag tables. The FeatureFlag version increases every time the FeatureFlag is updated, so the value stored for each user is recalculated whenever the FeatureFlag percentage changes.
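The version decision could work roughly as sketched below: a stored user value is reused only while its version matches the flag's version, and is redrawn otherwise. The type and function names here are illustrative assumptions, and only the fields needed for the version check are modelled.

```typescript
interface VersionedFlag {
  percentage: number;
  version: number;
}

interface StoredUserValue {
  value: boolean;
  version: number; // the flag version this value was drawn against
}

// Every update to the flag bumps its version.
function updatePercentage(flag: VersionedFlag, percentage: number): VersionedFlag {
  return { percentage, version: flag.version + 1 };
}

// Reuses the stored value if it is current, otherwise draws a new one.
function resolveValue(
  flag: VersionedFlag,
  stored: StoredUserValue | undefined,
  rng: () => number = Math.random,
): StoredUserValue {
  if (stored !== undefined && stored.version === flag.version) {
    return stored;
  }
  // Missing or stale: redraw against the current percentage.
  return { value: rng() * 100 < flag.percentage, version: flag.version };
}
```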
Reference-level explanation
Database design
UserFeatureFlag
identifier - User identifier, e.g. first_address
feature_flag - The feature_flag identifier, e.g. mobile-wallet_service_rollout
environment - The environment for this stored feature_flag, e.g.: production or staging
value - The stored value for this feature flag for this user, this is randomized at the first request and then stored so we always respond the same value for the same user
version - The FeatureFlag version of this UserFeatureFlag
FeatureFlag
identifier - The feature flag identifier, as a string. E.g.: mobile-wallet_service_rollout
percentage - The percentage to use when deciding if the user
version - This column is incremented every time the row is updated. This is used so we know when to invalidate the UserFeatureFlag for each user on request.
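For illustration, the two records could be typed as below, with only the columns listed above. The key helpers are hypothetical: the document does not define a key scheme, so the prefixes and the field order in the keys are assumptions.

```typescript
interface UserFeatureFlag {
  identifier: string; // user identifier, e.g. first_address
  feature_flag: string; // e.g. mobile-wallet_service_rollout
  environment: string; // e.g. production or staging
  value: boolean;
  version: number; // FeatureFlag version this value was stored for
}

interface FeatureFlag {
  identifier: string;
  percentage: number;
  version: number;
}

// Hypothetical key layout for a key-value store.
function userFeatureFlagKey(
  record: Pick<UserFeatureFlag, "environment" | "feature_flag" | "identifier">,
): string {
  return `user_feature_flag:${record.environment}:${record.feature_flag}:${record.identifier}`;
}

function featureFlagKey(identifier: string): string {
  return `feature_flag:${identifier}`;
}

// Records are small, so storing them as JSON strings is one simple option.
function serializeRecord(record: UserFeatureFlag | FeatureFlag): string {
  return JSON.stringify(record);
}
```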
Architecture design
Lambda
Since this service will be hit every time a user opens a Hathor Wallet, it needs to be scalable enough to handle usage spikes.
One potential drawback of using Lambda is the risk of costs growing out of control, but that can be limited by always setting hard limits in the functions' timeout configurations.
We can also rate limit the APIs, as the wallets will fall back to the old facade if they cannot reach our feature management APIs.
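On the wallet side, the fallback behaviour might look like the sketch below. It assumes that a `false` result means "use the old facade", and that a failed, rate-limited or slow request is treated the same way. `shouldUseWalletService` and `FlagFetcher` are hypothetical names, and the actual HTTP call is left to the caller.

```typescript
// Hypothetical: performs the GET request and resolves with the flag value.
type FlagFetcher = () => Promise<boolean>;

// Resolves with the flag value, or with false (old facade) when the
// request fails or does not finish within `timeoutMs`.
function shouldUseWalletService(fetchFlag: FlagFetcher, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    fetchFlag().then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        // Network error, rate limit or 404: fall back to the old facade.
        clearTimeout(timer);
        resolve(false);
      },
    );
  });
}
```

With this shape, a throttled or unavailable feature management API never blocks the wallet from starting; it only delays the rollout for that session.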
Redis
The data we store is small enough to fit in memory, even with a large userbase, so I think Redis is the best option to deliver fast response times.
Redis offers different persistence options (from: https://redis.io/topics/persistence):
RDB (Redis Database): The RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
AOF (Append Only File): The AOF persistence logs every write operation received by the server, that will be played again at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
No persistence: If you wish, you can disable persistence completely, if you want your data to just exist as long as the server is running.
RDB + AOF: It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
I think it is fine to use the RDB persistence, as losing UserFeatureFlag records (on a forced restart, for instance) is not a catastrophic failure: the user will just get another randomly generated response from the server on the next request.
Rationale and alternatives
We have previously discussed alternatives in https://github.com/HathorNetwork/internal-issues/issues/11
The conclusion was that we would use Optimizely, as we already had a working PoC, but we found an incompatibility between our react-native version and their library, so we decided to take a step back and design this simplified solution to have more information before deciding.
I think this initial implementation is simple enough to be developed and ready in a single sprint, fulfilling our current need: the initial rollout of the wallet service.