thisdot / starter.dev

A collection of kits to help bootstrap your next project.
https://starter.dev
MIT License

[Feature]: Allow maintainers to gather statistics on generator usage #790

Closed TapaiBalazs closed 1 year ago

TapaiBalazs commented 1 year ago

Is your feature request related to a problem? Please describe.

As the product owner, I'd like to see statistics on how much our generator tool is used and which kits are used the most. I'd like to see these statistics in Google Analytics.

Tracking which kit is the most used can work if no personal data (PI or PII) is tracked. Our goal is to simply collect information on the most used kits. The data should be stored as custom events in Google Analytics, using the Measurement Protocol.

The GA Measurement Protocol requires a measurement ID and an API secret. Since the API secret must stay secret, we cannot hardcode it into our generator tool. Instead, we need a secure endpoint that the generator can call, and its handler would call GA for us. One serverless handler should be sufficient and low-cost enough for this.
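As a rough illustration of what the handler would send to GA, here is a minimal sketch of building a GA4 Measurement Protocol request. The function name `buildGaRequest`, the event name `kit_generated`, and the parameter name `kit_name` are assumptions made for this example and are not defined anywhere in this ticket. The API secret is passed in as an argument on the assumption that the serverless handler reads it from its own environment, so it never ships with the generator.

```typescript
// Sketch only: `buildGaRequest`, `kit_generated`, and `kit_name` are
// hypothetical names chosen for illustration.
interface GaRequest {
  url: string;
  body: string;
}

function buildGaRequest(
  kitName: string,
  measurementId: string,
  apiSecret: string,
  clientId: string, // should be an anonymous value, never personal data
): GaRequest {
  // The Measurement Protocol takes both credentials as query parameters.
  const query =
    `measurement_id=${encodeURIComponent(measurementId)}` +
    `&api_secret=${encodeURIComponent(apiSecret)}`;

  return {
    url: `https://www.google-analytics.com/mp/collect?${query}`,
    // One custom event carrying only the kit name.
    body: JSON.stringify({
      client_id: clientId,
      events: [{ name: "kit_generated", params: { kit_name: kitName } }],
    }),
  };
}
```

The handler would then POST `body` to `url`. Keeping this step as a pure function makes it easy to check the payload without calling GA.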

The generator script must not fail because of tracking logic, so any kind of error should be completely suppressed.
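A minimal sketch of how that suppression could look on the generator side. The names `trackKitUsage` and `Sender` are hypothetical; the sender is injected so the wrapper does not assume any particular HTTP client.

```typescript
// Sketch only: `Sender` and `trackKitUsage` are hypothetical names.
type Sender = (kitName: string) => Promise<void>;

async function trackKitUsage(kitName: string, send: Sender): Promise<void> {
  try {
    // Covers both synchronous throws and rejected promises from `send`.
    await send(kitName);
  } catch {
    // Intentionally empty: a tracking failure must never affect kit generation.
  }
}
```

Because the returned promise never rejects, the generator can await it or ignore it without any error handling of its own.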

See the architecture diagram below:

[Architecture diagram: metrics-tracking]

Requirements

Acceptance criteria:


Original ticket contents:

Additional problems to consider:

Storing the data: We can build our own infrastructure (a serverless function with a DynamoDB connection) or we could use a third-party metrics system.

If we build our own infrastructure, building a UI where the statistics can be viewed would be an additional feature. If we go with a third-party metrics system (e.g., Mixpanel), that also has costs but comes with everything we might need.

Describe the solution you'd like

The generator script could send a request to a hardcoded endpoint with a payload that helps us track which kits get used the most.

This might pose a challenge: native fetch is only available by default from Node.js 18 onward, so we might need to bundle a package for making the request (e.g., axios).
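One alternative to bundling a package, which this ticket does not propose, would be to use native fetch when the runtime provides it and silently skip tracking otherwise. A minimal sketch, assuming a hypothetical `makeSender` helper, a placeholder endpoint URL, and a hypothetical `{ kit }` payload shape:

```typescript
// Sketch only: `FetchLike`, `makeSender`, and the `{ kit }` payload shape
// are assumptions for illustration.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<unknown>;

function makeSender(
  endpoint: string,
  fetchImpl: FetchLike | undefined,
): (kitName: string) => Promise<void> {
  if (typeof fetchImpl !== "function") {
    // Older Node.js release without native fetch: tracking becomes a no-op.
    return async () => {};
  }
  return async (kitName: string) => {
    await fetchImpl(endpoint, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ kit: kitName }),
    });
  };
}
```

In the generator, this could be called as `makeSender("https://example.invalid/track", (globalThis as any).fetch)`. The trade-off is that usage on older Node.js releases would go uncounted.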

If we use a third-party metrics solution, their client might have this solved. Either way, every kind of error should be suppressed so that kit generation is not affected by our metrics gathering.

Describe alternatives you've considered

GitHub Insights might give some insights on usage, but I don't know if it could be fine-tuned to detect which folders got cloned.

Additional context

No response

dustinsgoodman commented 1 year ago

Going to hold off on this for a bit while I talk to other maintainers and communities about how they're doing this.

TapaiBalazs commented 1 year ago

I found this documentation: https://developers.google.com/analytics/devguides/collection/ga4/events?client_type=gtag

And then there is this npm package: https://www.npmjs.com/package/ga-gtag

I believe we would be able to use Google Analytics for this.

TapaiBalazs commented 1 year ago

@dustinsgoodman

Upon further investigation, I discovered we could use the Measurement Protocol to send custom events to Google Analytics.

However, this would require us to bundle an API secret into the generator's source code, which is obviously not an option.

Therefore, for security reasons, we would still need an API endpoint (one serverless function maybe?) that gets called from the generator and hides the information from bad actors.

dustinsgoodman commented 1 year ago

We discussed options in a different thread. In summary, setting up GA to accept this from a Node server needs to be investigated, but it should be doable. The Serverless Function should probably just take in an input of "kit name" and send the metric for measurement. This should restrict the type of damage bad actors can do.
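To illustrate the "kit name only" input, here is a minimal sketch of how the function might validate its request body. The names `parseKitName` and `ALLOWED_KITS`, the `{ kit }` body shape, and the placeholder kit names are assumptions; checking against an allowlist is one possible way to limit what a caller can record, not something decided in this thread.

```typescript
// Sketch only: hypothetical allowlist with placeholder kit names.
const ALLOWED_KITS: ReadonlySet<string> = new Set([
  "example-kit-a",
  "example-kit-b",
]);

// Returns the kit name if the body is valid JSON of the form { "kit": "<known kit>" },
// otherwise null so the handler can ignore the request.
function parseKitName(
  rawBody: string,
  allowed: ReadonlySet<string> = ALLOWED_KITS,
): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const kit = (parsed as Record<string, unknown>).kit;
  return typeof kit === "string" && allowed.has(kit) ? kit : null;
}
```

With this shape, the only value an outside caller can influence is which known kit gets counted.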

dustinsgoodman commented 1 year ago

Let's proceed with this latest proposal.