home-assistant / architecture

Repo to discuss Home Assistant architecture

Privacy preserving usage statistics #320

Closed ties closed 3 years ago

ties commented 4 years ago

Unfortunately, I do not have time to implement this myself because of a new job. However, I wanted to get this idea out there.

Context

As mentioned during the State of the Union meeting, statistics on the usage of Home Assistant components are currently only available to core developers due to privacy concerns. At the same time, this data is extremely valuable and can help make informed decisions.

Proposal

Create a data publishing process using Statistical Disclosure Control for Microdata methods to anonymize data. This is the approach taken by national statistical agencies. Implementations in Python and R are available as well as GUI tools.

The anonymized data is less sensitive and can be used periodically to publish a subset of the data on feature, platform, and device usage.
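To make the idea a bit more concrete, here is a minimal sketch of one common disclosure-control step: suppressing records whose combination of attributes is too rare to publish safely (a k-anonymity style rule). This is not the API of any existing SDC package; the function names, the field names (`platform`, `component`), the sample records, and the threshold `k=5` are all assumptions made up for illustration.

```python
from collections import Counter


def suppress_rare_records(records, quasi_identifiers, k=5):
    """Keep only records whose quasi-identifier combination occurs at least k times.

    Hypothetical helper: a combination seen fewer than k times could point
    at a single installation, so those records are dropped before publishing.
    """
    def key(record):
        return tuple(record[name] for name in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]


def component_usage(records, component_field="component"):
    """Aggregate the remaining records into per-component counts."""
    return dict(Counter(r[component_field] for r in records))


# Made-up sample data, not real Home Assistant statistics.
records = (
    [{"platform": "raspberry_pi", "component": "hue"}] * 6
    + [{"platform": "docker", "component": "hue"}] * 5
    + [{"platform": "freebsd", "component": "rare_sensor"}] * 1
)

safe = suppress_rare_records(records, ["platform", "component"], k=5)
print(component_usage(safe))  # {'hue': 11}
```

The single `freebsd` / `rare_sensor` record is dropped because it is unique, while the aggregated `hue` count is still published. A real pipeline would likely combine this with other methods (recoding, noise, sampling) from an established SDC toolkit rather than rely on suppression alone.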

Consequences

Usage data becomes available to inform design decisions. However, the same data could also be provided through other means that do not require designing such a pipeline.

Positive:

Negative:

MartinHjelmare commented 3 years ago

We have https://analytics.home-assistant.io/ now.