fmvilas opened 1 year ago
@fmvilas I suggest targeting only the CLI, as we might create a library out of this that can be used by other tools.
I suggest we target Studio instead, since we'll make sure the library is browser-compatible too. I can see three libraries emerging from this work: one that's aware of the filesystem, another that's aware of the browser capabilities, and a shared one, used by both, that's in charge of communicating with the metrics endpoint and sending the information in the right way.
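Just to make the idea concrete, a rough sketch of how that split could look (all names below are made up, not a committed design):

```typescript
// Rough sketch of the split: a shared core that talks to the metrics
// endpoint, plus environment-aware wrappers built on top of it.
export interface MetricsTransport {
  send(metric: { name: string; value: number }): Promise<void>;
}

// Shared core: knows the endpoint and the payload format, nothing else.
export function createTransport(endpoint: string): MetricsTransport {
  return {
    async send(metric) {
      await fetch(endpoint, { method: 'POST', body: JSON.stringify(metric) });
    },
  };
}

// Filesystem-aware wrapper (CLI): could buffer metrics on disk before flushing.
// Browser-aware wrapper (Studio): could use navigator.sendBeacon instead.
```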
I think it would be better to convert it into a discussion. Since we enabled discussions, issues in the community repo usually relate to community repo work.
> I think it would be better to convert it into a discussion.
I agree. The point is that, afaik, Shape It does not have support for GH Discussions. I'm happy to either convert this into a discussion (you @derberg can do that, right?) and create a new issue for Shape It tracking, or rather the opposite.
Regardless of where (which project) we start collecting metrics from, I'm dropping here some caveats and ideas about the feature of showing the metrics publicly on the AsyncAPI website, which is the feature I consider needs some investigation prework.
Regardless of which service we use to collect metrics, it will have rate limits for queries.
Let's assume that we go with New Relic (where AsyncAPI has a free tier account).
All of those limitations could be removed if we avoid querying real-time metrics. Instead, we could collect those metrics periodically, store them in a cache/DB/filesystem, and make the AsyncAPI website fetch metrics from there instead of from the metrics provider API directly. That implies a "product" decision of not having real-time metrics, but I believe it is completely acceptable. Would anyone expect those metrics to be shown in real time?
The technical details about how to achieve this architectural design are soon to come, but I just wanted to drop this here.
New Relic has its own query APIs, which differ from the ingest ones. In fact, it has two: the one they promote, which is NerdGraph (GraphQL), and a REST API.
The latter, as it is somewhat deprecated, does not support many operations, such as NRQL queries (queries to the New Relic backend), meaning it is useless for our use case. So let's focus on the NerdGraph API.
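For illustration, a minimal sketch of what such a NerdGraph NRQL query could look like (the account ID, the User API key env var, and the `AsyncAPICommandExecution` event type are assumptions, not real names we have):

```typescript
// Minimal sketch: running an NRQL query through NerdGraph (GraphQL).
const gql = `{
  actor {
    account(id: ${Number(process.env.NR_ACCOUNT_ID)}) {
      nrql(query: "SELECT count(*) FROM AsyncAPICommandExecution SINCE 1 month ago") {
        results
      }
    }
  }
}`;

const res = await fetch('https://api.newrelic.com/graphql', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-Key': process.env.NR_USER_API_KEY!, // User key, not the License (ingest) key
  },
  body: JSON.stringify({ query: gql }),
});

console.log((await res.json()).data.actor.account.nrql.results);
```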
The rate limit for NerdGraph is 25 concurrent requests per user. That means no more than 25 requests to that API can be made at the same time. If we query NerdGraph on demand each time the new AsyncAPI website page (the one that will show the metrics) is requested, only 25 concurrent users will be supported; the rest will time out and won't see metrics. Not a big deal assuming the traffic won't be that high now, but eventually it could be, and this is for sure not resilient enough.
Additionally, there is another rate limit in place: the NRQL rate limits. This rate limit is way more complex because it is a combination of:
For this part, I would love to find community members willing to work on the UI part. Any suggestion is more than welcome.
As a side note, New Relic provides a React component that lets you show their metrics by using their widgets. See https://developer.newrelic.com/build-apps/
For illustration purposes, I'm sharing a mermaid chart with the very big picture of the architecture this solution could have, always assuming we use New Relic as the provider, but it could be any other.
```mermaid
---
title: Measure AsyncAPI Adoption - big picture
---
flowchart LR;
  subgraph Metrics visualization
    NR[NewRelic]-- metrics --> AsyncAPIWebsite
    AsyncAPIWebsite -- query metrics --> NR[NewRelic]
  end
  subgraph Metrics collection
    Studio & CLI & Others-- metrics --> NewRelic
  end
```
Considering the API rate limitations any provider will have in place (such as New Relic's, as I wrote in my previous comment), a "cache layer" should be in place. Again, no technical details about the implementation yet (it could be a service, a proxy, a serverless function...).
For some reason, the GH mermaid lib is falling behind the latest releases and does not support rich text, so I'm pasting the image instead. (Image not included here.)
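To make that cache layer more tangible, here is a minimal sketch of it as a tiny Node.js service (the `queryNerdGraph` helper is the NRQL-over-GraphQL call sketched earlier, and the refresh interval and port are arbitrary illustrative choices):

```typescript
import http from 'node:http';

// Assumption: queryNerdGraph() runs the NRQL query from the previous comment.
declare function queryNerdGraph(): Promise<unknown>;

let cached: unknown = null;

async function refresh(): Promise<void> {
  cached = await queryNerdGraph();
}

// Collect periodically; website traffic never triggers NerdGraph queries,
// so their rate limits stop being a concern.
refresh().catch(console.error);
setInterval(refresh, 10 * 60 * 1000);

// The website fetches metrics from here instead of from New Relic directly.
http.createServer((_req, res) => {
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(cached));
}).listen(8080);
```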
@smoya is the New Relic free tier enough to handle all our needs, or do we need to subscribe to a paid plan?
The only constraint we should be aware of is data retention. In our case, our metrics retention (dimensional metrics/custom events) is 30 days for all raw data points. However, aggregated data retention is 13 months. See https://docs.newrelic.com/docs/data-apis/manage-data/manage-data-retention/#dimensional-metrics
Meaning we are not able to see in deep detail all data points sent more than 30 days ago, but we can see 13 months of aggregated data (at 1-minute resolution, for example). This is completely fine for us, as we do not really need such granularity.
After having a conversation via Slack, we ended up with the conclusion that we could just show New Relic widgets directly from the public URLs that NR provides for each dashboard widget. That can be done through the UI of an NR dashboard: each widget gives you a public URL that, when requested, shows the widget (screenshot omitted). Embedding those, instead of having to query the New Relic API to fetch metrics, simplifies the architecture a lot: we do not need the intermediate cache layer and we are not affected by the API rate limits.
That means the big picture would now look like this:
```mermaid
---
title: Measure AsyncAPI Adoption - big picture
---
flowchart LR;
  subgraph Metrics visualization
    NR[NewRelic]-- embeddable widgets --> AsyncAPIWebsite
    AsyncAPIWebsite -- widgets public URL --> NR[NewRelic]
  end
  subgraph Metrics collection
    Studio & CLI & Others-- metrics --> NewRelic
  end
```
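For example, embedding one of those public widget URLs in the website could be as simple as the following sketch (the URL is a placeholder for whatever public chart link the dashboard widget gives us):

```tsx
// Sketch: the website just renders the widget's public URL; no API calls,
// no cache layer, no rate-limit concerns on our side.
export function AdoptionMetricsWidget() {
  return (
    <img
      alt="AsyncAPI adoption metrics"
      src="https://chart-embed.service.newrelic.com/herald/PLACEHOLDER-CHART-ID"
    />
  );
}
```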
There is one additional concern, and it is the fact that our clients (Studio, CLI, ...) will be exposing the New Relic API Key (License Key) used for sending metrics, both in the source code (except web apps like Studio) and when executing the requests (by checking network traffic).
This secret leakage could be used against us if someone wanted to send arbitrary data to our New Relic account. I think it is not necessary to go into detail about the possible consequences. See the security practices: https://docs.newrelic.com/docs/apis/intro-apis/new-relic-api-keys/#security-practices
There is one alternative solution we could implement, but it complicates the design a bit: adding an intermediate service we own that would be in charge of forwarding the metrics to New Relic. This service would be the one holding that API Key, and clients (Studio, CLI, etc.) would send the metrics to that service instead of to New Relic directly.
Users might still hit that service by re-sending the same requests the client does, and pollute the metrics, but they won't be able to send any other metrics or perform any other kind of operation on New Relic beyond the ones we allow on that service. Also, we could easily implement a check on the referer, if present, to allow only the Studio domain to execute a request, limiting in that way the possible damage to the CLI and any other non-web app. A rough sketch of such a forwarder follows the chart below.
```mermaid
---
title: Measure AsyncAPI Adoption - With metrics forwarder
---
flowchart LR;
  subgraph Metrics visualization
    NR[NewRelic]-- embeddable widgets --> AsyncAPIWebsite
    AsyncAPIWebsite -- widgets public URL --> NR[NewRelic]
  end
  subgraph Metrics collection
    Studio & CLI & Others-- metrics --> MetricsForwarder
    MetricsForwarder -- metrics --> NewRelic
  end
```
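A minimal sketch of what that metrics forwarder could look like, assuming New Relic's Metric API as the destination (the allowed metric name, Studio origin, and port are illustrative assumptions):

```typescript
import http from 'node:http';

// The forwarder is the only place holding the License Key.
const ALLOWED_METRICS = new Set(['asyncapi.adoption.action.executed']);

http.createServer(async (req, res) => {
  // Referer check: only meaningful for web clients like Studio;
  // CLI-originated requests won't carry it.
  const referer = req.headers.referer;
  if (referer && !referer.startsWith('https://studio.asyncapi.com')) {
    res.writeHead(403).end();
    return;
  }

  // Read and parse the client payload.
  const chunks: Buffer[] = [];
  for await (const chunk of req) chunks.push(chunk as Buffer);
  const metric = JSON.parse(Buffer.concat(chunks).toString());

  // Only forward the metrics we explicitly allow; drop everything else.
  if (!ALLOWED_METRICS.has(metric.name)) {
    res.writeHead(400).end();
    return;
  }

  // Forward to New Relic's Metric API; the License Key never leaves this service.
  await fetch('https://metric-api.newrelic.com/metric/v1', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Api-Key': process.env.NR_LICENSE_KEY!,
    },
    body: JSON.stringify([{ metrics: [metric] }]),
  });

  res.writeHead(202).end();
}).listen(8080);
```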
We can always just go and try, trusting in humanity and the fact that nobody hates this project, which I'm fine with :)
What about using Google Analytics? Have you considered it? AFAIK, there won't be many issues with rate limits. Also, exposing the token won't be an issue since it's something that also happens in the browser. I mean, not exactly a token but a GA ID. They also have a query param you can use so they don't track IPs (or at least they promise so 😄).
Leaving this here for reference: https://docs.newrelic.com/docs/apis/intro-apis/new-relic-api-keys/#key-details. We should have a look at Browser and Mobile App key options. They're essentially the same Google Analytics is providing.
I already considered using Browser and it's in fact a good solution, even though it's focused on web apps. The good point of Browser is that, IIRC, you can also limit the referer to a list of known webpages. The con is that we won't be able to do that for tools like the CLI. But anyway, better to publicly expose a Browser key rather than the License Key.
I'm gonna do a quick test and see how it behaves with a non-webpage app. Coming back in a few.
As expected, Browser won't work with non-website apps. Just taking a look at the snippet you need to use for loading the agent, you can see the `window` object is being used.
We could use Browser for the Studio app in order to collect runtime metrics (performance, load times, etc) but this is out of the scope of this issue.
Browser is discarded. Mobile doesn't make sense IMHO since it's like APM but with another layer on top to unify frontend + backend.
The reality is that the solution in New Relic is to use the metrics API, the one I talked about in my previous comments. We can indeed use GA as an alternative. I'm not very familiar with sending custom metrics and querying them since the last GA version anyway, so if you have a clear path that can save us investigation time, please share.
In fact, the GA snippet code requires being loaded in a website app. Not sure if there is a new alternative to that.
With https://developers.google.com/analytics/devguides/collection/protocol/ga4, it is possible to send events from any other source, but the way to do that is mostly the same as with New Relic: sending an HTTP request to a particular endpoint on their side and providing an API Key.
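For reference, sending an event through the GA4 Measurement Protocol looks roughly like this (the measurement ID, API secret, and event name are placeholders):

```typescript
// Same idea as New Relic's ingest: an HTTP POST plus a key.
const params = new URLSearchParams({
  measurement_id: 'G-XXXXXXXXXX',
  api_secret: process.env.GA_API_SECRET!,
});

await fetch(`https://www.google-analytics.com/mp/collect?${params}`, {
  method: 'POST',
  body: JSON.stringify({
    client_id: 'anonymous-client-id', // must not carry any private data
    events: [{ name: 'asyncapi_validate_executed', params: { tool: 'cli' } }],
  }),
});
```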
Just checked how Brew is doing it and got surprised. In the past they were using GA but now they're using https://influxdata.com. Have a look: https://github.com/Homebrew/brew/blob/HEAD/Library/Homebrew/utils/analytics.rb. It may be interesting to consider too.
Also, GA has the Measurement Protocol alternative which I don't think needs any secret to be exposed: https://developers.google.com/analytics/devguides/collection/protocol/v1/devguide?hl=en. That said, in some way or another, every service will ask you for a key. It doesn't matter to expose this key publicly if all you can do is send data. This is already possible from the browser console anyway and "nobody" is hacking it.
In the meantime, until we find the right metrics platform, I created a first POC of the shared library that will record the final metrics. Its functionality is very basic at this point but can help others (cc @peter-rr) start collaborating moving forward.
See it at https://github.com/smoya/asyncapi-adoption-metrics. You can see a usage example in the following test: https://github.com/smoya/asyncapi-adoption-metrics/blob/main/test/recorder.spec.ts
The next steps would be to create all the required shortcut methods on that metrics recorder for all the actions we think we could record. For example, the `recordActionExecution()` method is meant for recording CLI command executions like `validate`, but it could also record an action in the Studio.
As I said, very POC stuff. There are TODOs, like the New Relic sink, where you will find pending work, like converting metrics to the New Relic format before sending, etc. Please feel free to ask any questions!
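For anyone wanting to play with it, usage could look roughly like the sketch below. Note the import path and signatures are illustrative and may not match the POC exactly; the test linked above is the source of truth:

```typescript
// Illustrative only; actual names live in smoya/asyncapi-adoption-metrics.
import { Recorder, NewRelicSink } from '@smoya/asyncapi-adoption-metrics';

const recorder = new Recorder(
  new NewRelicSink({ apiKey: process.env.NR_LICENSE_KEY! })
);

// E.g. called from the CLI right after `asyncapi validate` finishes:
await recorder.recordActionExecution('validate', { tool: 'cli' });
```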
@smoya great POC. Could you elaborate on the anatomy of the actions we record (e.g. `recordActionExecution()`)? Are these time-series data? I'm asking just in case we switch the metrics platform (for instance, to influxdata).
That method is just an example of metrics we might want to collect, e.g. the number of `validate` CLI command calls. Metrics are still TBD, so I only added support for the GAUGE and COUNT time-series metric types. But it's easy to add any other, since this library does not handle metrics and their behaviour but just collects them and sends them anywhere via its sinks (e.g. New Relic).
You could change from New Relic to any other time-series provider tomorrow by just creating a new sink; the rest won't change. I took that design decision based on the current situation, where it is not yet clear which provider we will use due to the security concerns around exposing the API keys. That way, we are not blocked.
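To illustrate that decision, a sketch of the sink abstraction (interface and class names are assumptions), using a hypothetical InfluxDB sink since influxdata was suggested above:

```typescript
// Sketch only: GAUGE and COUNT are the types in the POC.
interface Metric {
  name: string;
  type: 'COUNT' | 'GAUGE';
  value: number;
  attributes?: Record<string, string>;
}

interface Sink {
  send(metrics: Metric[]): Promise<void>;
}

// Switching providers means writing one class like this; nothing else changes.
class InfluxDBSink implements Sink {
  constructor(private url: string, private token: string) {}

  async send(metrics: Metric[]): Promise<void> {
    // Convert each metric to InfluxDB line protocol and POST it to the
    // v2 write endpoint (bucket/org values are placeholders).
    const lines = metrics.map((m) => `${m.name} value=${m.value}`).join('\n');
    await fetch(`${this.url}/api/v2/write?bucket=adoption&org=asyncapi`, {
      method: 'POST',
      headers: { Authorization: `Token ${this.token}` },
      body: lines,
    });
  }
}
```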
~~Something I find challenging is imagining the metrics we want to collect in Studio. CLI is easy: counting each command execution (validate, generate, etc.), extracting data from documents, etc. However, the Studio is basically a live editor, where things happen in the background, like re-parsing and validating, every time you update the document. We might want to set some kind of limitation on those in order to get metrics that make sense to us. On deeper thought... what metrics do we need to understand how users use the Studio?~~
Moved into https://github.com/asyncapi/studio/issues/812
cc @fmvilas @Amzani
/progress 10 Made a first POC version of the shared library that tracks metrics.
@fmvilas I have no permission to edit this issue. Would you mind replacing the bullet points "Integration with CLI" and "Integration with Studio" with the following, respectively?
Thanks!
@smoya done.
Open source alternative to New Relic: https://signoz.io/
/progress 25 Updated the shared library so it can decorate metadata based on an AsyncAPI document - https://github.com/smoya/asyncapi-adoption-metrics/pull/2
/progress 35 Made a POC for CLI registering one metric https://github.com/asyncapi/cli/pull/859
This issue has been automatically marked as stale because it has not had recent activity :sleeping:
It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.
There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.
Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.
Thank you for your patience :heart:
@fmvilas @Amzani is this still relevant?
cc @Amzani
Problem
Since the inception of AsyncAPI, we've been driving the project based on our own opinions and perception of reality. We don't really have visibility into what users are doing with our tools and with the spec. Therefore, it's always hard to guess if a feature is successful or a complete failure (or somewhere in the middle :P). That's happening for both the spec and the tools.
Solution
We should start measuring the usage of our tools. It is super important that we don't track any private data (including IPs). Whatever metrics we get, they should be available on our website so anyone can consume them.
The solution should be able to:

- Answer questions like "`asyncapi validate` has been executed 1623 times this month", "60% of the documents are using version 2.4.0", etc.
- Show the user journey, e.g., users run `asyncapi validate` successfully, 40% run `asyncapi generate` next, 20% run `asyncapi validate` again, and the rest simply stop there. In other words, a funnel. Anyhow, the user should not be represented by any private data.

Rabbit holes
Scope
Out of bounds
Success criteria