PostHog / posthog

🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.
https://posthog.com

Feature flags #517

Closed jamesefhawkins closed 4 years ago

jamesefhawkins commented 4 years ago

This is a big feature. Why it matters...

PostHog have built out a lot of functionality for product teams, and have made everything nicer for an engineering team to implement by being open source.

Being an "open source alternative to " is good, and has given us a ton of growth. Some companies do just this, like Mattermost who are described by their founder as "open source Slack".

In their case, it makes total sense - engineers use Mattermost every day. The majority of the world's engineers do not dig into product analytics. Obviously, lots do and that has given us a foothold so far.

For PostHog to become wildly successful, we need every engineering team to use us, as well as meeting the product team's requirements.

For every engineering team to use us, we need to help the majority of engineers to get a sense of product usage in the tools they use every day, already:

If engineers get better at understanding usage, it gives them more autonomy over what they work on. For the companies they work for, it helps them grow.

Is your feature request related to a problem? Please describe.

As an engineer with a typical deployment process, I cannot easily roll out a new feature to a small group of users to test it out.

This means I don't know:

It also means:

Describe the solution you'd like

Describe alternatives you've considered

Neither of these gives much depth to product usage.

Additional context

How do we actually validate / get this live?

weyert commented 4 years ago

Feature flag support would be awesome. It would make A/B testing easy and keep the analytics in one place :)

timgl commented 4 years ago

Frontend

I suggest a new item in the frontend called 'features' where you can CRUD new features. Each row in the main table should have an on-off switch.

Each feature should have

posthog-js

When calling /decide, it should return a list of currently active feature flag keys that apply to that specific user. A user should be able to call: posthog.isFeatureEnabled('show-sidebar'), which will return true/false.

Another super cool thing would be to be able to add this to random HTML elements, like <div data-posthog-feature-flag="show-sidebar">.

Each feature flag needs to be set as a super property for that session, in this case {$feature_flags_active: ['show-sidebar']} so that all events get tagged correctly.
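The flow described above (flag keys arrive from /decide, `isFeatureEnabled` answers from them, and every event is tagged through a super property) could be sketched roughly as follows. This is only an illustration under assumptions: `createFlagClient`, `receiveDecideResponse`, `buildEvent` and the `{ featureFlags: [...] }` response shape are hypothetical names, not the actual posthog-js API.

```javascript
// Hypothetical sketch: none of these names are taken from posthog-js itself.
// Assumes /decide responds with something like { featureFlags: ["show-sidebar"] }.
function createFlagClient() {
  let activeFlags = [];
  let superProperties = {};

  return {
    // Store the flag keys returned by /decide and tag the session with them.
    receiveDecideResponse(response) {
      const keys = response && response.featureFlags;
      activeFlags = Array.isArray(keys) ? keys.slice() : [];
      superProperties = {
        ...superProperties,
        $feature_flags_active: activeFlags.slice(),
      };
    },

    // True only if /decide listed this key as active for the current user.
    isFeatureEnabled(key) {
      return activeFlags.includes(key);
    },

    // Every event gets the super properties merged in before it is sent.
    buildEvent(name, properties = {}) {
      return { event: name, properties: { ...superProperties, ...properties } };
    },
  };
}
```

Keeping the flags in a super property means callers do not have to remember to tag events themselves, which is the point of the proposal above.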

Question 1: I think this means we have to update the JS snippet unless I'm wrong @mariusandra

Question 2: I'm not sure about using an array here. I think this can become really slow if we have a lot of cases. An alternative would be to do {$feature_flags_show-sidebar: true}, though that will cause a lot of random keys to appear. A third option is to do this as a separate JSON field, though I don't think that improves performance. A fourth option is to have a table that stores the keys against event IDs, though that feels like overkill.

Question 3: When we start adding this to the dashboard, the most awesome possible thing would be to be able to have our toolbar open, then to be able to flick feature flags on-off and see that reflect instantly. I think the only way to do that would be to use hooks which might be out of scope, but @mariusandra maybe you have better ideas?

Backend

The /decide endpoint should get all active feature flags and work out for this specific user whether it applies or not:

This is just the implementation for posthog-js. Once we have people using that we can think about our other libraries, but shouldn't be too difficult to adapt.

macobo commented 4 years ago

Question 1: Users may block analytics tools for a variety of reasons, and PostHog is no exception. How would this feature work if posthog.js is unavailable?

Question 2: Analytics scripts are usually loaded asynchronously, often after main application javascript has finished loading. If feature flags are exposed via posthog.js, could that cause issues? Would this require users to move posthog to the HEAD element always?

Question 3: Feature flags are also invaluable in the backend. If feature flags/groups are managed via posthog, can this feature be exposed via language client libraries?

Note: If you're looking for something open-source to draw inspiration from, then https://github.com/jnunemaker/flipper/ is great and the potential value-add of integrating something like it would be invaluable.

timgl commented 4 years ago

Great points!

  1. This isn't as much of a problem for our self-hosted users (and is one of the benefits of PostHog) as it's hosted under their own domain, but is a problem for app. Optimizely just kind of throw their hands up and say that you should just use the backend.

Obviously the function call itself shouldn't error, and should just return 'default'. That's something we can handle in the snippet.

  2. We already load async so this is definitely a good point. My idea was to store the active feature flags for that user in localStorage, so that if the /decide api isn't called at the time of isFeatureEnabled, it'll still give the right answer. On the very first page load this would be a problem. We could either make the function blocking/async (though this might slow things down too much, but again only on 1st page load) or just accept that first page loads are sometimes wrong.

  3. 100%. Idea is to get this out in frontend, see if people adopt it and then add this to all our other libraries.
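The localStorage idea and the "just return 'default'" behaviour from the answers above could look something like this. It is a minimal sketch under assumptions: `createCachedFlags`, `onDecideLoaded` and the storage key are made-up names, and `storage` is any object with `getItem`/`setItem` (for example `window.localStorage`).

```javascript
// Hypothetical sketch; the names and the storage key are assumptions.
const STORAGE_KEY = "ph_active_feature_flags";

function createCachedFlags(storage) {
  let loadedFlags = null; // stays null until /decide has responded on this page load

  function readCache() {
    try {
      const raw = storage.getItem(STORAGE_KEY);
      const parsed = raw ? JSON.parse(raw) : null;
      return Array.isArray(parsed) ? parsed : null;
    } catch (err) {
      return null; // storage blocked or corrupted: behave as if nothing is cached
    }
  }

  return {
    // Call this once the /decide response has arrived.
    onDecideLoaded(flagKeys) {
      loadedFlags = flagKeys.slice();
      try {
        storage.setItem(STORAGE_KEY, JSON.stringify(loadedFlags));
      } catch (err) {
        // Ignore write failures; the in-memory copy still works for this page.
      }
    },

    // Never throws: falls back to the cache, then to the caller's default.
    isFeatureEnabled(key, defaultValue = false) {
      const flags = loadedFlags || readCache();
      if (flags === null) return defaultValue; // first page load, or script/network blocked
      return flags.includes(key);
    },
  };
}
```

With this shape the very first page load still returns the default, which matches the "accept that first page loads are sometimes wrong" option rather than the blocking one.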

macobo commented 4 years ago

Re 1, I think this might still be a challenge since even on a custom domain the script might eventually be blocked by privacy-conscious users via e.g. Ghostery.

Re 2, there are some considerations there - developers might assume that a feature being on/off will be sticky across a session/page load, which in turn can lead to some unintended bugs. But I might be being paranoid.

But maybe no 1 and no 2 are both solved by no 3 - if PostHog makes it easy to expose the flags in the backend and has a unified database storing the state of the flags, the benefits of providing an easy solution in the beginning outweigh the corner cases.

As users outgrow the simple solution they can then smoothly switch to a backend-based solution that can then be integrated back into the frontend in a way that helps work around issues. For example, I could see dumping the list of active feature flags into dom/global scope for posthog.js to read working in a relatively straight-forward way.

mariusandra commented 4 years ago

Hi @timgl and @macobo

  1. Unfortunately @macobo is right, self-hosting doesn't mean it won't get blocked. Someone somewhere adds */static/array.js to their list of naughty strings and we're out. I've heard stories of people having their own scripts on their own domains blocked because they were called "ads.js" or equivalent, so it's definitely possible.

Luckily I think that's only if you include posthog-js through a <script> tag. If you add posthog-js via npm and mangle it into your bundles, it's tricky to block anything. I'm not even sure what the options are here.

In any case we need to plan for the scenario where the script is not going to load and just return "default". For legit network errors if nothing else.

  2. Indeed, the third option would solve this. One way around the problem is to load the feature flags server-side. E.g. in Ruby:
# somewhere in ApplicationController
posthog.identify(distinct_id: user_id)

# later in an .erb file
<script>
window.__POSTHOG_FEATURE_FLAGS__ = <%= posthog.feature_flags.to_json %>
</script>

... and then posthog-js could pick them up.
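On the client side, picking up that server-rendered global might look like the sketch below. It assumes the server writes a JSON array of flag keys into `window.__POSTHOG_FEATURE_FLAGS__`; the thread does not pin that shape down, and `readBootstrappedFlags` is a hypothetical helper name.

```javascript
// Hypothetical sketch of how a client could read server-rendered flags.
// `globalObj` would be `window` in a browser; it is a parameter so this runs anywhere.
function readBootstrappedFlags(globalObj) {
  const value = globalObj && globalObj.__POSTHOG_FEATURE_FLAGS__;
  if (!Array.isArray(value)) return null; // nothing usable was rendered into the page
  return value.filter((key) => typeof key === "string"); // keep only string keys
}
```

Returning `null` rather than an empty array lets the caller tell "the server rendered no flags" apart from "the server rendered nothing at all", so it can still fall back to /decide or a default.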

mariusandra commented 4 years ago

@timgl for your original questions:

  1. Yup, the js code needs to be updated.
  2. A separate table is probably overkill, especially if we ever want to migrate to some simpler database formats. A Postgres search for "array includes string" is pretty fast if you have the right index. I guess that applies to arrays inside json as well. Worst case we can make a separate postgres string[] (array) field.
  3. We can add a function to the js client, something like
    posthog.onFeatureFlagsChanged(() => { /* do something */ })

    .. and then it's up to the client to rehydrate their app.
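A change hook like the one suggested in point 3 could be built on a small listener registry. This is a sketch only: `createFlagEmitter` and `setFlags` are hypothetical names, and only `onFeatureFlagsChanged` comes from the suggestion above.

```javascript
// Hypothetical sketch of a flag-change hook; not the actual posthog-js internals.
function createFlagEmitter() {
  let flags = [];
  const listeners = new Set();

  return {
    // Register a callback; the returned function unsubscribes it again.
    onFeatureFlagsChanged(callback) {
      listeners.add(callback);
      return () => listeners.delete(callback);
    },

    // Called whenever a fresh flag list arrives (e.g. from /decide or the toolbar).
    setFlags(nextFlags) {
      const next = [...new Set(nextFlags)]; // de-duplicate so the comparison below is exact
      const changed =
        next.length !== flags.length || next.some((key) => !flags.includes(key));
      flags = next;
      if (changed) listeners.forEach((callback) => callback(flags.slice()));
    },
  };
}
```

Listeners only fire when the set of active keys actually changes, so an app that rehydrates in the callback does not re-render on every identical response.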

Very cool though! The points you listed are rather straightforward to implement. It might take a while to get all client libraries up to date, but the basic feature could be done pretty fast!

fuziontech commented 4 years ago

I think this is a great start and provides a ton of value right off the bat. I agree that we should build this into the front-end as a first step, even if we push people towards using the npm package vs dropping it in via script. Step two will probably be mobile libraries since most of what developers will be using this for is tweaking user experience.

Eventually I definitely think enabling it on the backend through libraries is something that will need to happen, and should be relatively easy to implement.

Something that we should really think about here is user targeting for feature switches. One approach is the one like you mentioned with bucketing randomly and keeping them sticky to that bucket. Another really interesting feature would be targeting a switch to a Posthog Cohort. You could trigger a different experience based on some set of rules defined for that Cohort, like maybe try different copy on the product page after 5 product page visits. That could get really powerful.
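Random-but-sticky bucketing is commonly done by hashing the user's ID together with the flag key, so the same user always lands in the same bucket without any stored state. The sketch below shows that general technique, not necessarily what PostHog ends up doing: `isInRollout` is a hypothetical helper, and the hash is an FNV-1a-style function over UTF-16 code units chosen only because it needs no dependencies.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a32(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash;
}

// Hypothetical helper: is this user inside the first `rolloutPercentage` buckets?
function isInRollout(flagKey, distinctId, rolloutPercentage) {
  // Including the flag key means a user is not in the same bucket for every flag.
  const bucket = fnv1a32(`${flagKey}:${distinctId}`) % 100; // 0..99
  return bucket < rolloutPercentage;
}
```

Because the bucket depends only on the flag key and the ID, raising the percentage from 10 to 50 keeps everyone who was already in, which gives the stickiness mentioned above.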

I do like the idea of having the switches be a map since this could get to be a huge list of switches/decisions a user might have tagged to them. Checking whether they are in or out would be faster. It would also allow associating a value to the rule to make the switches not just boolean and a bit more powerful.
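To make the array-versus-map trade-off concrete, here is a small sketch of a map-shaped payload and a lookup on it. The `$feature_flags` property name, the example keys and the `getFlagValue`/`isFlagOn` helpers are all assumptions made up for illustration.

```javascript
// Hypothetical map shape: flag key -> value, so flags need not be boolean.
const exampleFlagMap = {
  "show-sidebar": true,
  "checkout-copy": "variant-b",
};

// Look a flag up by key, falling back to a default when it is absent.
function getFlagValue(flagMap, key, defaultValue = false) {
  return Object.prototype.hasOwnProperty.call(flagMap, key) ? flagMap[key] : defaultValue;
}

// Boolean view of the same data, for callers that only care about on/off.
function isFlagOn(flagMap, key) {
  return Boolean(getFlagValue(flagMap, key));
}

// An event tagged with the map instead of the array form.
const exampleEvent = {
  event: "pageview",
  properties: { $feature_flags: exampleFlagMap },
};
```

The lookup is by key rather than a scan of a list, and the value slot is what would allow multivariate flags later on.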

weyert commented 4 years ago

Really interesting. It would be great to have statistics on which values a feature flag has returned, so you can keep track of how the feature flags are doing. Like X times true, Y times false and Z times nothing got returned.

E.g. this makes it easier to decide when it's time to delete the feature flag. Also, maybe it would be a good idea to be able to make a feature flag short-lived by associating an expiration date with it, after which it is automatically deleted?

Personally, I think we shouldn't worry about ad blockers. If people decide to use ad blockers, they should bear with any issues it may cause on websites, like not getting offered the Beta website. I guess, in theory, you could make the name and path of the script customisable to avoid some ad blockers catching it for the non-self-hosted solution.

weyert commented 4 years ago

I'd love to assist in the development of this feature. Would this be possible?

timgl commented 4 years ago

@weyert Absolutely, I just pushed up the branch with my current progress.

What's left to do is:

Are any of those interesting for you to work on? Or is there something else you fancy tackling? Also feel free to join our slack for easier comms.

weyert commented 4 years ago

Thank you, I have changed the Slack group. I am happy to discuss the subject. I can imagine the decide endpoint would only be necessary for more difficult decisions which aren't static. For this we could just pull the feature flags on load and maybe poll for changes every once in a while, and keep a log of events and send them in batches?

I am wondering whether something like sendBeacon (https://developer.mozilla.org/en-US/docs/Web/API/Navigator/sendBeacon) would be helpful here?
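The "keep a log of events and send them in batches" half of this idea could be sketched as below; polling is left out. `createEventBatcher`, the batch size and the URL are hypothetical, and the transport is injected so the same code runs outside a browser.

```javascript
// Hypothetical sketch: queue events and hand them to `send` in batches.
// `send(url, body)` should return true when the payload was accepted.
function createEventBatcher(send, url, maxBatchSize = 10) {
  let queue = [];

  function flush() {
    if (queue.length === 0) return false; // nothing to send
    const batch = queue;
    queue = [];
    const accepted = send(url, JSON.stringify(batch));
    if (!accepted) queue = batch.concat(queue); // keep events for a later retry
    return accepted;
  }

  return {
    // Queue one event and flush automatically once the batch is full.
    log(event) {
      queue.push(event);
      if (queue.length >= maxBatchSize) flush();
    },
    flush,
  };
}
```

In a browser, `send` could be `(url, body) => navigator.sendBeacon(url, body)`, with `flush` called when the page is hidden. `sendBeacon` returns `false` when the browser refuses to queue the payload, which is why the sketch keeps the batch in that case.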