DavidWells / analytics

Lightweight analytics abstraction layer for tracking page views, custom events, & identifying visitors
https://getanalytics.io
MIT License

Consent-aware storage mechanism #346

Open dobesv opened 1 year ago

dobesv commented 1 year ago

By default, analytics currently stores an __anon_id field in localStorage. Under GDPR, however, we don't want to store any persistent information until the user gives consent to do so.

We can prevent this by passing in a storage option with a transient storage type (e.g. using @analytics/global-storage-utils).
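As a rough illustration of the transient option, here is a minimal in-memory storage object. It assumes the `storage` option accepts any object exposing `getItem`, `setItem`, and `removeItem`; `createMemoryStorage` is a hypothetical helper, not part of the library or of @analytics/global-storage-utils.

```javascript
// Minimal in-memory storage: nothing is written to cookies or localStorage,
// so all values disappear on page reload.
// Assumption: the `storage` option only needs getItem / setItem / removeItem.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : undefined),
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}

// Hypothetical wiring (requires the `analytics` package and your own plugins):
// const analytics = Analytics({ app: 'my-app', plugins, storage: createMemoryStorage() });
```

With this in place, __anon_id would live only for the lifetime of the page.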

However, if the user later gives consent, we would want to transition over to using permanent storage.

It would be very helpful if the library was "batteries included" in the sense that it provides an easy way to do this, perhaps with a dedicated consent module.

Initially it could use global storage. But if consent is granted to store tracking cookies, it would copy the relevant data from the global variable storage to localStorage and start using localStorage after that.
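The copy-on-consent idea could look roughly like the sketch below. All names (`createConsentAwareStorage`, `grantConsent`, `revokeConsent`) are hypothetical; `persistent` is assumed to be any object with the localStorage-style `getItem` / `setItem` / `removeItem` methods, for example `window.localStorage`.

```javascript
// Sketch: keep values in memory until consent is granted, then copy them
// into the persistent backend and use that backend from then on.
function createConsentAwareStorage(persistent) {
  const transient = new Map(); // used while consent is missing
  const persistedKeys = new Set(); // keys this wrapper wrote to `persistent`
  let consented = false;

  return {
    getItem(key) {
      const value = consented ? persistent.getItem(key) : transient.get(key);
      return value === undefined ? null : value;
    },
    setItem(key, value) {
      if (consented) {
        persistent.setItem(key, value);
        persistedKeys.add(key);
      } else {
        transient.set(key, value);
      }
    },
    removeItem(key) {
      transient.delete(key);
      if (persistedKeys.delete(key)) persistent.removeItem(key);
    },
    // Copy everything collected so far into persistent storage.
    grantConsent() {
      if (consented) return;
      for (const [key, value] of transient) {
        persistent.setItem(key, value);
        persistedKeys.add(key);
      }
      transient.clear();
      consented = true;
    },
    // Pull the wrapper's own keys back into memory and erase them on disk.
    revokeConsent() {
      if (!consented) return;
      for (const key of persistedKeys) {
        transient.set(key, persistent.getItem(key));
        persistent.removeItem(key);
      }
      persistedKeys.clear();
      consented = false;
    },
  };
}
```

One limitation of this sketch: on revocation it only moves keys it wrote itself during the current page load, so values persisted in an earlier session would need separate cleanup.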

It may also be helpful if the library used different storage options for different kinds of consent.

For example, in Google Tag Manager they have ad_storage, analytics_storage, functionality_storage, personalization_storage, and security_storage that are configured and consented to separately.

I'm not sure if any of the analytics plugins currently would benefit from this distinction but they might.

If the analytics object had 5 different storage objects, each with a flag indicating whether it had consent, and plugins could get an event when storage consent changed, they could use that to propagate consent state to their upstream library or otherwise adjust their storage-related behavior.
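A minimal sketch of per-category consent flags with change notifications might look like this. The category names are borrowed from the Google Tag Manager list above; `createConsentState` and its methods are hypothetical and do not exist in the library.

```javascript
// Category names taken from the GTM consent types mentioned above.
const CATEGORIES = [
  'ad_storage',
  'analytics_storage',
  'functionality_storage',
  'personalization_storage',
  'security_storage',
];

// Sketch: one boolean flag per category, plus listeners that plugins could
// use to react when consent changes. Everything defaults to "no consent".
function createConsentState() {
  const state = Object.fromEntries(CATEGORIES.map((c) => [c, false]));
  const listeners = new Set();

  return {
    get: () => ({ ...state }),
    has: (category) => state[category] === true,
    // Returns an unsubscribe function.
    onChange(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
    // Applies known categories only; notifies listeners if anything changed.
    set(updates) {
      const previous = { ...state };
      for (const c of CATEGORIES) {
        if (c in updates) state[c] = Boolean(updates[c]);
      }
      const changed = CATEGORIES.some((c) => previous[c] !== state[c]);
      if (changed) {
        for (const listener of listeners) {
          listener({ previous, current: { ...state } });
        }
      }
      return changed;
    },
  };
}
```

A plugin could subscribe with `onChange` and forward the new flags to its upstream library, while plugins that do not care simply never subscribe.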

dobesv commented 1 year ago

Here's a module I created for our project to implement some of this:

https://gist.github.com/dobesv/9022c95d39461eaa05e9f2125d72a203

dobesv commented 1 year ago

Newer version:

https://gist.github.com/5bb7d83e1380560042b0f53b9978b697

dobesv commented 1 year ago

I think an API like this might make sense:

// Initialize with consent
analytics = new Analytics({
  plugins,
  storageConsent: { ads: false, performance: false }
});

// Specify the user's consent settings.  Consent is generally false by default.
// When consent is granted, items are moved from global variable to persistent storage
// When consent is revoked, items are moved from persistent storage to global variable
// Triggers an event that plugins or event listeners can listen for and apply if they support it
analytics.setStorageConsent({
  ads: true,
  analytics: true,
  performance: false,
  tracking: true
});

// Get the storage consent object that was last set
analytics.getStorageConsent();

// Have an event related to consent available for analytics.on and for plugins as well
// GTM & Google Analytics plugins can be updated to pass the consent settings to gtag
// to configure the storage in the library
analytics.on('setStorageConsent', ({ oldStorageConsent, newStorageConsent, instance, ...rest }) => {
  // React to the consent change here
});

// Storage calls specify what type of data is being stored
// When consent is not granted for at least one of the given categories, item is stored in global variable
// For backwards compatibility it could default to not checking for consent if no consent type is provided
// ANON_ID and USER_ID would be stored requiring 'tracking' or 'analytics'
analytics.storage.setItem('some_key', 'some_value', ['ads', 'analytics']);
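
To make the category-aware `setItem` proposal concrete, here is a small self-contained sketch. It is not the library's implementation: `createCategoryStorage` is a hypothetical name, `persistent` is assumed to be a localStorage-like object, and "at least one listed category is granted" is taken as the rule for persisting, which is one reading of the comments above.

```javascript
// Sketch: route each item to persistent storage or to an in-memory store
// depending on whether any of its categories currently has consent.
function createCategoryStorage(persistent) {
  const globalStore = new Map(); // stand-in for the "global variable" storage
  const categoriesByKey = new Map(); // remembers each key's categories
  let consent = {};

  // No categories given: skip the consent check (backwards compatibility).
  const allowed = (categories) =>
    !categories ||
    categories.length === 0 ||
    categories.some((c) => consent[c] === true);

  function place(key, value, categories) {
    if (allowed(categories)) {
      globalStore.delete(key);
      persistent.setItem(key, value);
    } else {
      persistent.removeItem(key);
      globalStore.set(key, value);
    }
  }

  function getItem(key) {
    return globalStore.has(key) ? globalStore.get(key) : persistent.getItem(key);
  }

  return {
    getItem,
    setItem(key, value, categories) {
      categoriesByKey.set(key, categories);
      place(key, value, categories);
    },
    // Re-evaluates every known key so items move when consent changes.
    setStorageConsent(next) {
      consent = { ...next };
      for (const [key, categories] of categoriesByKey) {
        const value = getItem(key);
        if (value !== null && value !== undefined) place(key, value, categories);
      }
    },
    getStorageConsent: () => ({ ...consent }),
  };
}
```

Under these assumptions, granting `analytics` would move an item tagged `['tracking', 'analytics']` into persistent storage, and revoking all consent would move it back into memory.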
dobesv commented 1 year ago

Example code to apply consent to google scripts using gtag: https://gist.github.com/0dba69925b8975e69b3392da46063db2
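
For readers who cannot open the gist, the general shape of passing consent to gtag is sketched below. This is not the gist's code. The mapping from the thread's category names (`ads`, `analytics`) to Consent Mode keys is an assumption, and `gtag` is passed in as a parameter so the sketch does not depend on a global.

```javascript
// Hypothetical mapping from this thread's category names to gtag consent keys.
const GTAG_KEYS = {
  ads: 'ad_storage',
  analytics: 'analytics_storage',
};

// Convert { ads: true, analytics: false } into gtag's granted/denied strings.
// Categories that are not mentioned are left out rather than denied.
function toGtagConsent(storageConsent) {
  const result = {};
  for (const [category, gtagKey] of Object.entries(GTAG_KEYS)) {
    if (category in storageConsent) {
      result[gtagKey] = storageConsent[category] ? 'granted' : 'denied';
    }
  }
  return result;
}

// `mode` is 'default' before the user has chosen, 'update' afterwards.
function applyGtagConsent(gtag, storageConsent, mode = 'update') {
  gtag('consent', mode, toGtagConsent(storageConsent));
}
```

A plugin listening for the proposed consent event could call `applyGtagConsent(window.gtag, newStorageConsent)` whenever the settings change.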

joe-mohan commented 1 year ago

I'm implementing this functionality by only initialising Analytics after consent has been given. Can anyone see any potential issues? This is in a Vue project: on page mounted, after prompting the user to consent, I then initialise analytics:

const analytics = Analytics({
  // app: 'alienworlds',
  plugins: [
    googleAnalytics({
      measurementIds: ['G-XXXXXXX'],
    }),
  ],
})
analytics.page()

Seems to be firing fine
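
The init-after-consent approach can be wrapped in a small helper so the instance is created at most once. This is only a sketch: `createLazyAnalytics` is a hypothetical name, and `createInstance` stands for whatever function builds the Analytics instance (for example, the snippet above without the `page()` call).

```javascript
// Sketch: defer creating the analytics instance until the user consents.
function createLazyAnalytics(createInstance) {
  let instance = null;
  return {
    // Call from the consent prompt's "accept" handler.
    onConsent() {
      if (!instance) {
        instance = createInstance();
        instance.page(); // record the page view that triggered consent
      }
      return instance;
    },
    // Returns null until consent has been given.
    get: () => instance,
  };
}
```

Anything that happens before `onConsent()` is called is simply never tracked with this approach.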

dobesv commented 1 year ago

With Google Analytics, Google Tag Manager, and some other libraries you can tell them to load without storing any cookies or using localStorage. They just send the events with a non-persistent session ID in them. If you later give storage permission, they will store the session ID so that the user's actions are tracked across pages and reloads.

It is nice to be able to see some analytics for users who did not accept cookies as you can still at least see page view counts, browser usage, and data like that for those users.

dobesv commented 1 year ago

This ticket actually just requests that the analytics library itself has a similar mechanism - it can load without storage consent and just store the anonymousId / userId in memory. Later if consent is given it can use localStorage. In principle some plugins for it could make use of the same mechanism instead of making their own.

DavidWells commented 1 year ago

I like this idea @dobesv https://github.com/DavidWells/analytics/issues/346#issuecomment-1331674180

I need to implement some additional GDPR features coming up in the new year... It will likely be in the form of a plugin.

The hard part is mapping plugins to the cookies, localStorage entries, etc. that they drop. Those change quite often and it wouldn't be feasible to hard code them into plugins. This is why tools like https://www.cookiebot.com/ exist, as they scan your site & help categorize the cookies.

I'd love a generic way to do this but I'm not sure if it's possible without a scanner like the tools out there on the market.

https://github.com/orestbida/cookieconsent looks promising but works in a funky way in how it loads scripts.

Ideally, people could just disable plugins by default and then present something like the https://github.com/orestbida/cookieconsent modal for users to opt into tracking. How that gets sliced and diced into categories is the tricky part. Maybe plugins just fall into a category & don't load any JS until the opt-in is clicked. This is how the init & enabled flags https://getanalytics.io/conditional-loading/#how-does-it-work were originally designed.

dobesv commented 1 year ago

I think if you provide an API that plugins can use as a standard way to determine which types of consent the user has given, and also to get an event when the consent has changed, it could go a long way.

Each plugin can decide how "aware" of these things it wants to be, if at all. The Google stuff is quite configurable now, but many scripts out there are kind of "all or nothing".

For the plugins that are not aware of consent settings, you could make a wrapper that wraps the plugin and does not initialize it until some kind of consent is given.
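
Such a wrapper might look roughly like the sketch below. It assumes a generic plugin shape (an object with a `name` and optional `initialize`, `page`, `track`, and `identify` methods that each take a single argument); `deferUntilConsent` and `hasConsent` are hypothetical names, and a real integration would also need to respect the library's own plugin lifecycle.

```javascript
// Sketch: wrap a consent-unaware plugin so it is not initialized, and
// receives no calls, until `hasConsent()` returns true.
function deferUntilConsent(plugin, hasConsent) {
  let initialized = false;

  // Initialize the wrapped plugin the first time consent is present.
  const ensureInitialized = (args) => {
    if (!initialized && hasConsent()) {
      if (typeof plugin.initialize === 'function') plugin.initialize(args);
      initialized = true;
    }
    return initialized;
  };

  // Forward a call only once the plugin has been initialized.
  const guard = (method) => (args) => {
    if (ensureInitialized(args) && typeof plugin[method] === 'function') {
      return plugin[method](args);
    }
    return undefined; // calls made before consent are dropped, not queued
  };

  return {
    ...plugin,
    initialize: (args) => {
      ensureInitialized(args);
    },
    page: guard('page'),
    track: guard('track'),
    identify: guard('identify'),
  };
}
```

Dropping pre-consent calls is the simplest choice here; queuing them for replay after consent would itself amount to retaining user data before consent was given.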

ianmartorell commented 1 year ago

I came to this package precisely looking for a standardised way to manage consent across multiple analytics solutions. It's a big pain to do since different solutions have different levels of consent granularity, or none at all, but the APIs proposed in this thread look great and would go a very long way to improve DX and maintainability when dealing with consent.

kmclaugh commented 1 year ago

Just throwing my two cents in here. I think @dobesv is exactly right on the way to handle this. The Analytics library could provide an API that selects the storage type according to the consent state. In other words, when you tell the Analytics API that consent isn't granted, it stores everything in global (in-memory) storage. When consent is granted, it promotes that data to localStorage.

I don't think it should integrate with cookie banners out of the box. It's the developer's job to interpret consent from banners correctly and pass that info to the Analytics API.

I'd be happy to work on this with anyone. For now, I'm just not loading Analytics when consent isn't granted, per @joe-mohan's solution.