TapaiBalazs closed 1 year ago
Going to hold off on this for a bit while I talk to other maintainers and communities about how they're doing this.
I found this documentation: https://developers.google.com/analytics/devguides/collection/ga4/events?client_type=gtag
And then there is this npm package: https://www.npmjs.com/package/ga-gtag
I believe we would be able to use Google Analytics for this.
@dustinsgoodman
Upon further investigation, I discovered we could use the Measurement Protocol to send custom events to Google Analytics.
However, this would require us to bundle an API secret into the generator's source code, which is obviously not an option.
Therefore, for security reasons, we would still need an API endpoint (one serverless function maybe?) that gets called from the generator and hides the information from bad actors.
We discussed options in a different thread. In summary, setting up GA to accept this from a Node server needs to be investigated, but it should be doable. The Serverless Function should probably just take in an input of "kit name" and send the metric for measurement. This should restrict the type of damage bad actors can do.
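To illustrate how narrow the function's input could be, here is a minimal sketch of a validator that accepts a body containing only a kit name and rejects everything else. The field name `kitName`, the function `parseKitName`, and the allowed-character pattern are assumptions made for illustration; none of them come from the discussion above.

```typescript
// Hypothetical pattern: lowercase letters, digits, and dashes, at most 64 characters.
const KIT_NAME_PATTERN = /^[a-z0-9][a-z0-9-]{0,63}$/;

// Returns the kit name if the request body is exactly { kitName: "<valid name>" },
// otherwise null. Any extra field causes rejection, so callers cannot smuggle
// additional data into the metric.
function parseKitName(input: unknown): string | null {
  if (typeof input !== "object" || input === null) {
    return null;
  }
  const keys = Object.keys(input);
  if (keys.length !== 1 || keys[0] !== "kitName") {
    return null;
  }
  const value = (input as { kitName: unknown }).kitName;
  if (typeof value !== "string" || !KIT_NAME_PATTERN.test(value)) {
    return null;
  }
  return value;
}
```

Checking against a fixed list of known kit names would be stricter still, at the cost of updating the function whenever a kit is added.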
Let's proceed with this latest proposal.
Is your feature request related to a problem? Please describe.
As the product owner, I'd like to be able to see statistics on how much our generator tool is used and which kits are used the most. I'd like to see these statistics in Google Analytics.
Tracking which kit is the most used can work if no personal data (PI or PII) is tracked. Our goal is to simply collect information on the most used kits. The data should be stored as custom events in Google Analytics, using the Measurement Protocol.
The GA Measurement Protocol requires a measurement ID and an API secret. Since the API secret is a secret, we cannot hardcode it into our generator tool. For this, we need a secure endpoint which we can call, and the handler would call GA for us. One serverless handler should be sufficient and low-cost enough for this.
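A rough sketch of what the handler could build before calling GA, assuming the GA4 Measurement Protocol `mp/collect` endpoint. The event name `kit_generated`, the parameter `kit_name`, the constant `client_id`, and the function and type names are illustrative assumptions. In a real handler the measurement ID and API secret would come from the serverless platform's secret storage rather than from source code.

```typescript
interface GaConfig {
  measurementId: string; // supplied by the handler's environment, never by the generator
  apiSecret: string;
}

interface GaRequest {
  url: string;
  body: string;
}

// Builds the URL and JSON body for a single custom event.
// The client_id is a fixed, non-identifying value (an assumption of this sketch),
// so no personal data is attached to the event.
function buildMeasurementRequest(kitName: string, config: GaConfig): GaRequest {
  const query =
    `measurement_id=${encodeURIComponent(config.measurementId)}` +
    `&api_secret=${encodeURIComponent(config.apiSecret)}`;
  const body = JSON.stringify({
    client_id: "starter-generator",
    events: [{ name: "kit_generated", params: { kit_name: kitName } }],
  });
  return { url: `https://www.google-analytics.com/mp/collect?${query}`, body };
}
```

The handler would then POST `body` to `url`. Because the secret only ever appears in the URL built on the server side, the generator itself ships with nothing but the endpoint address.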
The generator script must not fail because of tracking logic, so any kind of error should be completely suppressed.
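One way the generator could suppress tracking errors is sketched below. The names `trackKitUsage` and `Sender` and the 2-second default timeout are assumptions for illustration. The actual network call is passed in as a function, so the wrapper only deals with swallowing failures and bounding the wait.

```typescript
// The function that performs the real request; hypothetical shape.
type Sender = (kitName: string) => Promise<void>;

// Resolves to true if the metric was sent, false on any failure or timeout.
// It never rejects, so kit generation cannot fail because of tracking.
async function trackKitUsage(
  kitName: string,
  send: Sender,
  timeoutMs = 2000
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("tracking timed out")), timeoutMs);
    });
    await Promise.race([send(kitName), timeout]);
    return true;
  } catch {
    // Deliberately ignored: tracking is best effort.
    return false;
  } finally {
    // Clear the timer so it does not keep the process alive after generation.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  }
}
```

The generator could call `await trackKitUsage(name, send)` and ignore the result. The boolean is returned only so that the behavior is easy to check in tests.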
See the architecture diagram below:
Requirements
Acceptance criteria:
Original ticket contents:
Additional problems to consider:
Storing the data: We can build our own infrastructure (a serverless function with a DynamoDB connection) or we could also use a third-party metrics system.
If we build our own infrastructure, it would be an additional feature to make a UI where the statistics can be viewed. If we go with a third-party metrics system (e.g.: Mixpanel), that also has costs but comes with everything we might need.
Describe the solution you'd like
The generator script has room for sending a request out to a hardcoded endpoint with a payload that can help us track information on which kits get used the most.
Making the request might also pose a challenge: native fetch is only supported in newer Node.js releases, so we might need to bundle a package for making the request (e.g. axios).
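As an alternative to bundling a package, the generator could detect whether a global `fetch` exists and otherwise fall back to Node's built-in `https` module. The sketch below assumes this approach; `chooseTransport` and `postJson` are hypothetical names, and the endpoint URL would be whatever the tracking function ends up using.

```typescript
import * as https from "node:https";

type Transport = "fetch" | "https";
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<unknown>;

// Reports which transport is available in the given scope (globalThis by default).
function chooseTransport(
  scope: { fetch?: unknown } = globalThis as unknown as { fetch?: unknown }
): Transport {
  return typeof scope.fetch === "function" ? "fetch" : "https";
}

// POSTs a JSON payload using fetch when present, otherwise https.request.
function postJson(url: string, payload: unknown): Promise<void> {
  const body = JSON.stringify(payload);
  const headers = { "Content-Type": "application/json" };
  if (chooseTransport() === "fetch") {
    const nativeFetch = (globalThis as unknown as { fetch: FetchLike }).fetch;
    return nativeFetch(url, { method: "POST", headers, body }).then(() => undefined);
  }
  return new Promise<void>((resolve, reject) => {
    const req = https.request(url, { method: "POST", headers }, (res) => {
      res.resume(); // discard the response body
      res.on("end", () => resolve());
    });
    req.on("error", reject);
    req.end(body);
  });
}
```

This avoids an extra dependency, but it means maintaining the fallback path ourselves. Bundling a request library remains the simpler option if the added size is acceptable.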
If we use a third-party metrics solution, their client might have this solved. Either way, every kind of error should be suppressed, so kit generation is not affected by our metrics gathering.
Describe alternatives you've considered
GitHub Insights might give some insights on usage, but I don't know if it could be fine-tuned to detect which folders got cloned.
Additional context
No response