bobsilverberg closed this issue 3 years ago
@jvillalobos suggested that a possible mitigation would be to enroll fewer users in future experiments so that the impact of this is lessened. Our current infrastructure does support this. For example, we could put 10% of users in cohort A, 10% in cohort B, and 80% excluded from the experiment.
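For illustration, the weighted split described above (10% / 10% / 80%) can be sketched as a simple allocation function. The cohort names, weights, and function name here are invented examples, not code from the experiment infrastructure:

```javascript
// Hypothetical weighted cohort allocation with an "excluded" bucket.
const EXPERIMENT_COHORTS = [
  { name: 'cohortA', weight: 0.1 },
  { name: 'cohortB', weight: 0.1 },
  { name: 'notInExperiment', weight: 0.8 },
];

function assignCohort(cohorts, random = Math.random()) {
  let cumulative = 0;
  for (const { name, weight } of cohorts) {
    cumulative += weight;
    if (random < cumulative) {
      return name;
    }
  }
  // Guard against floating-point rounding: fall back to the last bucket.
  return cohorts[cohorts.length - 1].name;
}
```

Lowering the first two weights (and raising the excluded bucket) is all it takes to enroll fewer users.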
One issue that wasn't mentioned in the description above is the fact that, in addition to not being able to send the GA event on the server, we also cannot write a cookie on the server as it interferes with our caching. That was actually the main reason we had to move to this client-side-only method.
One idea that @muffinresearch and I discussed in our 1:1 was to use Redux for this on the server. The basic idea would be to place the user into a cohort on the server (which we are currently not doing) and store that fact in the Redux store. Then, on the client side, check whether the user has been allocated to a cohort in a pending state; if so, write the cookie and send the enrollment event, then update the data in the Redux store so this does not happen again.
There may be reasons this won't work, but it's worth discussing/investigating.
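A minimal sketch of what that "pending enrollment" state could look like as a plain Redux reducer. The action types and state shape are hypothetical, not taken from the actual addons-frontend code:

```javascript
// Hypothetical action types for the pending-enrollment idea.
const ENROLL_PENDING = 'EXPERIMENT_ENROLL_PENDING';
const ENROLL_CONFIRMED = 'EXPERIMENT_ENROLL_CONFIRMED';

const initialState = { cohort: null, pending: false };

function experimentReducer(state = initialState, action = {}) {
  switch (action.type) {
    case ENROLL_PENDING:
      // Server side: allocate the user to a cohort, flagged as pending.
      return { cohort: action.cohort, pending: true };
    case ENROLL_CONFIRMED:
      // Client side: cookie written and GA event sent; clear the flag
      // so the enrollment work doesn't happen twice.
      return { ...state, pending: false };
    default:
      return state;
  }
}
```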
You'd enroll all users with the same cached response into the same cohort though, wouldn't you? That may be fine, but it may also skew the results...
Oh right, caching... ☹️
There's also the impact of the client-side transformation potentially affecting the validity of the experiment. People in the experimental group will first see the same page as people in the control group; then, if the experiment changes things in the viewport, there will be a flicker to the experimental version of the page. This skews the results.
Most client side A/B testing providers use an anti-flicker technique that hides the whole page until the A/B testing code has decided in which cohort to place people and performed the potential DOM transformations. This is also problematic for the validity of the experiment. Worse, this will affect the KPIs because the website will appear slower to 100% of the visitors during the duration of the experiment.
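For context, the anti-flicker pattern mentioned above can be sketched with injectable hide/reveal hooks, which keeps the timing logic separate from the DOM. All names here are illustrative; this is not code from any real A/B testing provider:

```javascript
// Hide the page until the A/B decision is made, with a safety timeout
// so the page is never hidden indefinitely.
function createAntiFlicker({ hide, reveal, timeoutMs = 2000 }) {
  let revealed = false;
  const doReveal = () => {
    if (!revealed) {
      revealed = true;
      reveal();
    }
  };

  // Hide the page immediately, before any experiment code runs.
  hide();

  // Safety net: never keep the page hidden longer than timeoutMs.
  const timer = setTimeout(doReveal, timeoutMs);

  // The caller invokes this once the cohort decision and any DOM
  // transformations are done.
  return function onDecided() {
    clearTimeout(timer);
    doReveal();
  };
}
```

In a browser, `hide` would typically inject a `<style>` tag setting `body { opacity: 0 }` and `reveal` would remove it. As the comment notes, the cost is that 100% of visitors pay the hiding delay.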
As the linked article suggests, server side transformations are the least inaccurate, least KPI disturbing way to do A/B testing. That can be done on the origin server or on an edge computing platform.
Thanks for this @rik. All of these are reasons why the experiment infrastructure was originally designed to make the changes on the server, and not client-side. It seems we have a lot of reasons to reverse our decision to defer all experiment code to run on the client, but, as @diox points out, we need to figure out how to deal with the caching aspect. Does anyone have any suggestions which would address all of these concerns? If not, I can schedule a meeting for us to brainstorm/discuss alternatives.
Can we design UI changes to not cause side effects? For instance, I read somewhere about the problem with the download button but we could have styled it so that the original button was taking the same "space" as the "new" button, which would have limited the visible impact on the page.
This is an interesting idea but...
A few additional thoughts/notes:
Thanks @diox. To document something else that was discussed in today's meeting; we don't need a solution for this right away, as the next experiment should have no visual differences between the cohorts, but it's likely something we'll need to figure out before the experiment after that.
There was some further discussion about this on Slack.
@diox suggested we look into the idea of rendering server side and letting it be cached, and suggested that this could be fine as long as we check how many users were actually enrolled in each side.
This led to a discussion that ideally we want users enrolled into a cohort for the duration of an experiment, which means keeping track of that via a cookie. We wouldn't really want the same user seeing UX for different cohorts based on what they were randomly served from the cache each time they return to the site.
In order to make this work we would need to add the contents of the cookie to the cache key (something we've discussed before, and isn't ideal, but I'm not sure what else would work), and then the nginx server would need to respect the value in an existing cookie when serving cached data. We also need to make sure that we only write the cookie on the client, as we discovered in the past that trying to write a cookie on the server breaks caching entirely.
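To make the cache-key idea concrete, here is a hypothetical nginx sketch. The cookie name, zone name, and upstream are invented for illustration; the real configuration would differ:

```nginx
# Sketch only: vary the cache key on the experiment cookie as well as
# the URI, so users in different cohorts get different cached responses.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=frontend:10m;

server {
  location / {
    proxy_cache frontend;
    # $cookie_NAME exposes the value of an incoming cookie named NAME.
    proxy_cache_key "$scheme$host$request_uri$cookie_frontend_active_experiments";
    proxy_pass http://addons_frontend_upstream;
  }
}
```

Note this multiplies the number of cache entries per URL by the number of distinct cookie values, which is part of why keying on cookies "isn't ideal".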
I will file a separate issue for this investigation.
I have implemented a solution which addresses most of our issues, but I am not sure if it entirely addresses caching. There is a draft PR with code changes available at https://github.com/mozilla/addons-frontend/pull/10665. Here is what it does:
When a component which is wrapped with `withExperiment` is instantiated (i.e., in the `constructor`), the following happens:

Whatever variant is determined by the steps above is written to a private property in the component and then used in the `render` method, which is what determines how the component will look based on the variant.
When the code for the component runs on the client (via `componentDidMount`), we check whether we need to add the experiment variant to the cookie (based on whether the experiment already exists in the cookie or not). If we do need to write a cookie, we do two things:
We then dispatch another action to remove the variant from the Redux store.
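The client-side step described above could be sketched as a plain function, separated from the React lifecycle so the logic is easy to follow. The function, prop, and action names here are illustrative, not copied from the PR:

```javascript
// Hypothetical sketch of the componentDidMount logic: confirm a
// server-allocated variant on the client exactly once.
function confirmEnrollmentOnClient({
  experimentId,
  storedVariant,        // variant placed in the Redux store on the server
  cookies,              // cookie jar with get/set
  sendEnrollmentEvent,  // sends the GA enrollment event (client-only)
  removeStoredVariant,  // dispatches the action clearing the Redux store
}) {
  // Nothing to do if no variant was allocated, or the cookie already
  // records this experiment.
  if (!storedVariant || cookies.get(experimentId) !== undefined) {
    return false;
  }
  // 1. Persist the variant so the user stays in the same cohort.
  cookies.set(experimentId, storedVariant);
  // 2. Send the GA enrollment event.
  sendEnrollmentEvent({ experimentId, variant: storedVariant });
  // 3. Clear the variant from the Redux store so this runs only once.
  removeStoredVariant();
  return true;
}
```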
I have tested this code locally and it seems to work for different combinations of server generated and client generated pages, with and without an existing cookie.
Currently it will only work for one experiment at a time, but it would not be difficult to update it to support multiple experiments running simultaneously.
I am not sure exactly how all this will work with caching, nor am I sure how to go about testing that.
@willdurand I'd appreciate your thoughts on this approach and also on the code at https://github.com/mozilla/addons-frontend/pull/10665.
@diox I'd appreciate hearing how you think this might work (or even whether it will work) with caching.
This was closed via mozilla/addons-frontend#10665
When we made changes to our experiment infrastructure for the most recent experiment, it was decided that we would defer all experiment code to run on the client. The reason for this is that sending events to GA is integral to making an experiment work, and we cannot send events to GA from the server. Some alternatives were investigated, but they were determined to be inadequate or too hacky, so we decided that the solution to this problem was to simply make sure that all code related to experiments which sends GA events runs only client-side.
This has resulted in some performance degradation when an experiment is running and a user first hits the site. The page they land on is rendered, and then some elements on the page might change as a result of the user being added to an experiment cohort and the UI being updated to match what is expected for their cohort.
@diox has raised this as an important issue, so we need to discuss alternatives to our current solution and the pros and cons of keeping what we have vs changing to a different solution. Among the things we should consider are the fact that experiments are meant to be temporary (i.e., there likely will not always be an experiment running), as well as the fact that, for most experiments, we split users between a control group and an experimental group, and only the experimental group should experience this performance hit.