mreid-moz opened 3 years ago
If we implement #30, people could self-serve google analytics on subdomains without too much difficulty.
It's possible to enable access logs on individual GCP buckets via Terraform (see https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket#logging and https://cloud.google.com/storage/docs/access-logs). These logs contain the IP address and the object that was requested, dumped about once a day. It would require the bucket to be managed by ops, but something like the following could happen:
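On the Terraform side, turning on access logging might look roughly like this (a sketch based on the provider docs linked above; both bucket names are hypothetical placeholders, not real resources):

```hcl
# Sketch (untested): a bucket to receive the access logs.
resource "google_storage_bucket" "usage_logs" {
  name     = "protodash-usage-logs"  # placeholder name
  location = "US"
}

# Sketch: a prototype bucket with access logging pointed at it.
resource "google_storage_bucket" "protodash" {
  name     = "protodash-example-bucket"  # placeholder name
  location = "US"

  logging {
    log_bucket        = google_storage_bucket.usage_logs.name
    log_object_prefix = "usage"
  }
}
```

The resulting log objects could then be loaded into BigQuery for the query below.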
The query would look something like this, assuming `analysis.protodash_usage` contains the bucket usage logs:
```sql
WITH
  extracted AS (
    SELECT
      TIMESTAMP_MICROS(time_micros) AS timestamp,
      cs_bucket,
      c_ip
    FROM
      analysis.protodash_usage )
SELECT
  TIMESTAMP_TRUNC(timestamp, DAY) AS timestamp_day,
  cs_bucket,
  COUNT(DISTINCT c_ip) AS n
FROM
  extracted
GROUP BY
  1, 2
ORDER BY
  1, 2
```
This topic came up again yesterday in the context of the numbers-that-matter dashboard. Knowing visit counts is helpful, but it is sometimes important to understand who is using a particular resource. For example, if a dashboard is aimed at high-level decision-makers, we would want to know whether they (or someone reporting to them) are looking at it.
Google Analytics has a "user id" feature which we could associate with the login on authenticated dashboards (which probably correspond to the cases where we'd want fine-grained analytics on who is accessing stuff and when):
https://support.google.com/analytics/answer/3123662?hl=en https://www.lovesdata.com/blog/google-analytics-user-id
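Per the docs linked above, wiring this up on an authenticated page would look roughly like the following (a sketch: the measurement ID is a placeholder, and `authenticatedUserEmail` is a hypothetical variable the auth layer would need to expose):

```html
<!-- Sketch: associate GA hits with the logged-in user via the user_id feature. -->
<script>
  gtag('config', 'G-XXXXXXXXXX', {
    // Hypothetical value supplied by the dashboard's auth layer.
    'user_id': authenticatedUserEmail
  });
</script>
```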
Audit logging of the resources (server-side) would probably have higher fidelity, especially when ad-blockers interfere with GA's data collection. The audit log object includes authentication information for bucket access that sits behind some form of IAM.
I've set up something simple for page visits to https://protosaur.dev/mps-deploys/ in this data-sandbox-terraform PR. The logs are written to a bigquery table pretty much instantaneously, although the principal that is logged is the protodash service account (i.e. the request is being proxied). If the audit logs are set up in the main protodash project, it's likely we can count page loads by authenticated user.
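If the audit logs do end up in BigQuery, counting page loads per authenticated user could be sketched like this (assuming a hypothetical export table `analysis.protodash_audit_logs`; the `principalEmail` field comes from the Cloud Audit Logs schema):

```sql
-- Sketch: daily page loads per authenticated principal.
-- Table name is a placeholder for wherever the audit log sink writes.
SELECT
  TIMESTAMP_TRUNC(timestamp, DAY) AS timestamp_day,
  protopayload_auditlog.authenticationInfo.principalEmail AS principal,
  COUNT(*) AS n_requests
FROM
  analysis.protodash_audit_logs
GROUP BY
  1, 2
ORDER BY
  1, 2
```

Note that until the proxying issue mentioned above is sorted out, `principal` would just show the protodash service account rather than the end user.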
This seems like a good way forward if we can make it work.
This would help us understand how much prototypes are being used.