omnilinguist opened this issue 8 years ago
Hi @omnilinguist! In most production environments we expect users to store serialized PlanOut-language code (https://facebook.github.io/planout/blog/planout-language.html) and namespaces in some kind of database. To roll out a treatment to broader populations, one simply allocates more segments of the namespace to that experiment (namespaces should also live in a db). If I am understanding your bonus question correctly, this is trivially supported by universes. In general, experimenters should be able to instrument their code once to grab parameters from a namespace, after which no subsequent changes to the native code base are needed for any type of follow-on experiment (assuming your future experiments don't require you to change any application logic).
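For readers following along, here is a minimal self-contained sketch of the segment-allocation idea (Python stdlib only; `NUM_SEGMENTS`, `get_segment`, and `allocate` are illustrative names, not PlanOut's actual API): rolling out more broadly is just allocating more segments, with no client code changes.

```python
import hashlib

NUM_SEGMENTS = 100  # hypothetical namespace size

def get_segment(unit, namespace_name="my_namespace"):
    """Deterministically map a unit (e.g. a user id) to a segment,
    mimicking PlanOut-style hashing (illustrative, not the real code)."""
    digest = hashlib.sha1(f"{namespace_name}.{unit}".encode()).hexdigest()
    return int(digest, 16) % NUM_SEGMENTS

# segment -> experiment mapping; in production this lives in a database
segment_map = {}

def allocate(experiment_name, segments):
    """Assign free segments to an experiment. Widening the rollout is
    just allocating additional segments to the same experiment."""
    for s in segments:
        assert s not in segment_map, "segment already allocated"
        segment_map[s] = experiment_name

allocate("exp_v1", range(0, 10))    # launch at 10% of traffic
allocate("exp_v1", range(10, 30))   # later: widen to 30%

def experiment_for(unit):
    return segment_map.get(get_segment(unit))  # None -> default experience
```

Because the unit-to-segment hash never changes, users already in the experiment stay in it as more segments are allocated.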
If you haven't already you might want to also check out the PlanOut paper, which discusses management in more detail.
Re: databases -- we tried to leave it up to the developer to decide how they want to store stuff. The documentation and base APIs are written in a way that makes it easy to get started, but if you dig into the source code for the reference implementation you can see notes on how you'd want to store and cache things. In general, most assignment procedures can and should be done in an online fashion (because it is faster, more reliable, and more reproducible), so that the only things you need to store are the serialized experiments, the segment->experiment mappings for namespaces, and some metadata. In some cases it is valuable to be able to query external services (e.g. to get information for gating, clusters to be used in randomization, or contexts for use with a policy), and we just introduced an API for adding those external services that we might write about in a blog post soon :)
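As an entirely illustrative sketch of the "only store the serialized experiments" point: the dict below stands in for a database table, and the JSON shape is just one example of serialized PlanOut code, not a canonical schema.

```python
import json
from functools import lru_cache

# Stand-in for a database table of serialized experiments
# (names and JSON shape are illustrative):
EXPERIMENTS = {
    "button_color": json.dumps(
        {"op": "set", "var": "color",
         "value": {"op": "uniformChoice",
                   "choices": ["red", "blue"],
                   "unit": {"op": "get", "var": "userid"}}}),
}

@lru_cache(maxsize=None)
def load_experiment(name):
    """Deserialize once and cache; assignment itself stays online and
    stateless, so nothing per-user ever needs to be written back."""
    return json.loads(EXPERIMENTS[name])
```

The cache means repeated lookups of the same experiment return the same deserialized object; updating an experiment would just mean writing new JSON and invalidating the cache entry.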
@eytan, here's my next iteration of comments, in roughly the order I care about them (I am trying to process the framework as quickly as possible, but am just getting started, so please bear with me a bit):
My current assumption is that there would be some centralized service exposing something like `getTreatment("my_experiment", unit1, unit2, unit3...)` for clients (or even wrapping the fetching of the auxiliary arguments in a service), and that the experiment->planoutJson mappings would be stored in a db table as you mentioned. If there is no centralized service that wraps the experiment functionality, then every service will have to hit the db directly (perhaps with some caching as described below) and do all the experiment processing locally. If some of these requests also end up needing to make sideways calls to other services or datastores to get the additional parameters to pass into `assign()`, then this will all be decentralized.
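To make the centralized-service option concrete, here is a rough sketch of what such a wrapper might look like. Everything here is hypothetical: `EXPERIMENT_DB` stands in for the db table of serialized experiments, and the hash-based choice stands in for real evaluation by the PlanOut interpreter.

```python
import hashlib

# Hypothetical experiment registry; in production this would be a db table
# holding serialized PlanOut JSON, evaluated by the PlanOut interpreter.
EXPERIMENT_DB = {
    "my_experiment": {"param": "button_color", "choices": ["red", "blue"]},
}

def get_treatment(experiment_name, **units):
    """Single entry point for clients: they never construct experiment
    subclasses or interpreter instances themselves."""
    config = EXPERIMENT_DB[experiment_name]
    # Deterministic hash-based choice, standing in for interpreter evaluation.
    key = experiment_name + "." + ".".join(
        f"{k}={v}" for k, v in sorted(units.items()))
    idx = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(config["choices"])
    return {config["param"]: config["choices"][idx]}
```

A client would then call, e.g., `get_treatment("my_experiment", userid=42)` and receive a dict of parameter values, with all storage and assignment logic hidden behind the service boundary.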
When a client calls `getTreatment(...)`, does the code behind it have to somehow dynamically create not only the subclasses of `SimpleExperiment` but also the actual instances of them, and then call `get()` on them? Ideally all of this should be hidden from the consumer, so that all they need to do is call `getTreatment()` with a given experiment name plus additional arguments and get back one or more parameters (maybe in JSON format). The way you responded suggests that this is doable using the interpreter, so would it make sense to just wrap the interpreter in some interface (possibly in its own service) that abstracts all the experiment-table db access and experiment processing from the clients?

The way the `Interpreter` class is designed (specifically the fact that its constructor takes a `serialization` argument) naturally suggests caching a single `Interpreter` instance in memory per experiment configuration (as specified via PlanOut JSON): if an experiment is dynamically updated, we just need to make another `Interpreter` instance with the updated JSON; otherwise we can re-use the same instance and avoid the overhead of instance construction. Does that sound right? Also, would you happen to know from experience whether this setup would be performant at very high traffic?

I also have a question about the `WeightedChoice` operator. Let's say I have treatments "a", "b", "c" with initial weights of [0.1, 0.1, 0.8], and I want to extend this to [0.3, 0.3, 0.4]. However, this will result in the people who initially saw "b" now seeing "a", presumably because of the way that `WeightedChoice` maps the configuration to the hashes (the first 0.3 would capture the groups initially seeing "a" and "b", as well as the first 1/8 of the people initially seeing "c"). But I think we can kind of hack around this by setting the choices to ["a", "b", "a", "b", "c"] with weights [0.1, 0.1, 0.2, 0.2, 0.4]; here the people who originally saw "b" would continue to see "b" (let's say that "c" is a special case where it is OK for those users to see different treatments over time). Does this seem right to you? The only major concern is whether these separate entries with the same choices might cause any problems (it doesn't seem like they should, since in the end the treatment is the same, but just in case).

I expect to possibly have more questions about `Namespace`s later on, but that is not part of the initial version of what I am building. In any case, I am also trying to figure out some of this stuff as I go, but some expert pointers would be useful :)
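The duplicated-choices reasoning above can be checked empirically with a toy stand-in for the operator. This mimics the cumulative-weight-over-a-hash scheme described in the question; it is not PlanOut's actual `WeightedChoice` implementation.

```python
import hashlib

def weighted_choice(choices, weights, unit, salt="exp.param"):
    """Toy stand-in for PlanOut's WeightedChoice: hash the unit to a
    deterministic point in [0, sum(weights)), then walk the cumulative
    weights until the point is covered."""
    h = int(hashlib.sha1(f"{salt}.{unit}".encode()).hexdigest()[:13], 16)
    point = h / float(0x10000000000000) * sum(weights)  # uniform in [0, total)
    cum = 0.0
    for choice, w in zip(choices, weights):
        cum += w
        if point < cum:
            return choice
    return choices[-1]

units = [f"user{i}" for i in range(1000)]
orig = {u: weighted_choice(["a", "b", "c"], [0.1, 0.1, 0.8], u) for u in units}
# Duplicated-choices trick: the first 0.1 + 0.1 of the interval is unchanged,
# so everyone who originally saw "a" or "b" keeps seeing the same treatment.
stable = {u: weighted_choice(["a", "b", "a", "b", "c"],
                             [0.1, 0.1, 0.2, 0.2, 0.4], u) for u in units}
```

Because the total weight is 1.0 in both configurations, each unit hashes to the same point, and any point below 0.2 falls in an interval with an unchanged label; only original "c" users can move, which is exactly the property the hack relies on.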
To whom it may concern,
I found this project while investigating A/B testing frameworks, and while it seems to provide a great deal of the functionality I am looking for, one big design question seems to remain unanswered after reading the documentation and skimming part of the implementation (actually two, but they are related):
I am wondering how Facebook or other users get around these apparent limitations (I may think of more later), but apart from these, much of the rest of the system looks fairly clean yet sophisticated!