Description
Product usage analytics helps us understand how Kedro is used. It tells us whether the features we developed have succeeded and signals where we need to improve our approach.
We shipped the first version of Kedro-Telemetry to understand the usage of the CLI and Kedro-Viz. However, we're still missing some high-level information like:
How many Kedro users do we have? The research question is, "How many users identified by username have run at least one CLI command?"
How many users of Kedro-Viz do we have? The research question is, "How many users identified by username have opened Kedro-Viz or run the kedro viz CLI command?"
How many users of Kedro-Viz experiment tracking do we have? The research question is, "How many users identified by username have opened Kedro-Viz and have opened the runsList or experiment-tracking pages on Kedro-Viz?"
How many projects are using Kedro? The research question is, "How many projects identified by project_name have had someone run at least one CLI command from that project?"
All of these values assume that Kedro-Telemetry is installed and activated according to our consent-based workflow.
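The four research questions above reduce to counting distinct identifiers over filtered events. Below is a minimal sketch of that counting, assuming the events have been exported as a list of dictionaries. The event names and the `event`, `command`, and `page` fields are hypothetical placeholders, not the actual Kedro-Telemetry schema; only `username`, `project_name`, `runsList`, and `experiment-tracking` come from the questions themselves.

```python
from typing import Any, Dict, Iterable, Mapping

# Hypothetical event names; the real Kedro-Telemetry event names may differ.
CLI_EVENT = "CLI command"
VIZ_OPENED_EVENT = "Kedro-Viz opened"
VIZ_PAGE_EVENT = "Kedro-Viz page viewed"
EXPERIMENT_PAGES = {"runsList", "experiment-tracking"}


def summarise(events: Iterable[Mapping[str, Any]]) -> Dict[str, int]:
    """Count distinct users and projects for the four research questions."""
    events = list(events)

    # Users who have run at least one CLI command.
    cli_users = {e["username"] for e in events if e["event"] == CLI_EVENT}

    # Users who have opened Kedro-Viz.
    viz_openers = {e["username"] for e in events if e["event"] == VIZ_OPENED_EVENT}

    # Users who have run the `kedro viz` CLI command.
    viz_cli_users = {
        e["username"]
        for e in events
        if e["event"] == CLI_EVENT and e.get("command") == "kedro viz"
    }

    # Users who have opened one of the experiment-tracking pages.
    experiment_page_users = {
        e["username"]
        for e in events
        if e["event"] == VIZ_PAGE_EVENT and e.get("page") in EXPERIMENT_PAGES
    }

    # Projects from which someone has run at least one CLI command.
    projects = {
        e["project_name"]
        for e in events
        if e["event"] == CLI_EVENT and e.get("project_name")
    }

    return {
        "kedro_users": len(cli_users),
        "kedro_viz_users": len(viz_openers | viz_cli_users),
        "experiment_tracking_users": len(viz_openers & experiment_page_users),
        "projects": len(projects),
    }
```

The sketch treats `username` alone as the user identity. Counting by a combination of `username` and `project_name` instead would change every user total, so the definition of a user needs to be settled before these numbers are compared.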
Context
Part of the difficulty might lie in how user identities are defined on Heap.
Here is what I have observed:
On Heap, a User ID is created, and it is unknown whether this field maps 1-to-1 to the username field we collect from Kedro-Telemetry. Heap uses this field to generate its built-in charts on the Number of Users.
On our side, we send all of our user properties, like username or even project_name, as event properties when they might be better modeled as user properties.
Possible Implementation
There are two parts to this:
1. How do we get a consistent user identifier on Heap? Can we use the User ID field? Can we send username to replace User ID?
2. How do we make it possible to create a summative view of projects on Heap? This may require adding project_name or another project identifier to user properties.
Re: Point 2. This would require some discussion about what a user is. Is a user consistently defined by their username, or is a user a username AND project_name?
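To make the two parts concrete, here is a minimal sketch of how the payloads might be shaped if username (optionally combined with project_name) were sent as the identity and project_name were moved to user properties. It only builds dictionaries and makes no network calls. The `app_id`, `identity`, `event`, and `properties` keys reflect my understanding of Heap's server-side API and should be checked against the Heap documentation; the helper names and the `username:project_name` format are hypothetical.

```python
from typing import Any, Dict, Optional


def build_identity(username: str, project_name: Optional[str] = None) -> str:
    """Return the identity string, depending on how a user is defined.

    With only a username, a user is the same across projects. With a
    project_name, each (username, project_name) pair is a distinct user.
    The "username:project_name" format is a hypothetical choice.
    """
    if project_name is None:
        return username
    return f"{username}:{project_name}"


def build_track_payload(
    app_id: str, identity: str, event: str, properties: Dict[str, Any]
) -> Dict[str, Any]:
    """Build an event payload that carries an explicit identity (assumed shape)."""
    return {
        "app_id": app_id,
        "identity": identity,
        "event": event,
        "properties": properties,
    }


def build_user_properties_payload(
    app_id: str, identity: str, project_name: str
) -> Dict[str, Any]:
    """Build a payload attaching project_name as a user property (assumed shape)."""
    return {
        "app_id": app_id,
        "identity": identity,
        "properties": {"project_name": project_name},
    }
```

One consequence worth discussing: a user property holds a single value per identity, so if a user is defined by username alone and works on several projects, a single project_name user property could not represent all of them.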