dotnet / aspire

An opinionated, cloud ready stack for building observable, production ready, distributed applications in .NET
https://learn.microsoft.com/dotnet/aspire
MIT License
3.37k stars 350 forks

Aspire Dashboard - Persist telemetry data #4256

Open sharpSteff opened 1 month ago

sharpSteff commented 1 month ago

Currently the telemetry data is held in a circular buffer in memory. It would be great to have the option to persist the data.
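For readers unfamiliar with the term: a circular (ring) buffer keeps only the most recent N entries and drops the oldest one whenever a new entry arrives at full capacity. Because it lives in process memory, nothing survives a restart. The Python sketch below only illustrates that behavior; it is not the dashboard's actual C# code, and the capacity and record names are made up for the example.

```python
from collections import deque

# Hypothetical capacity, chosen small so the eviction is easy to see.
CAPACITY = 3

# A deque with maxlen acts as a circular buffer: appending to a full
# deque silently discards the oldest entry.
telemetry = deque(maxlen=CAPACITY)

for i in range(5):
    telemetry.append(f"log-{i}")

# Only the newest CAPACITY entries remain; log-0 and log-1 are gone.
print(list(telemetry))
```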

drewnoakes commented 1 month ago

@sharpSteff can you elaborate a little more on the scenario here? Are you focussed on the local dev loop, or a hosted dashboard? What about data that's missed while the dashboard is offline/restarting?

Drsela commented 1 month ago

@drewnoakes I would like to elaborate on this feature request, as it would be very beneficial for my use case.

Using the Standalone Aspire Dashboard, and having multiple microservices use OTEL to send data to this dashboard is vital for log aggregation and metrics.

However, server maintenance often requires patching and rebooting, which results in the loss of all data stored in memory by the Standalone Aspire Dashboard as mentioned here. This can be problematic for production scenarios.

This feature would help prevent the loss of log files during such reboots. While I don't expect the Standalone Aspire Dashboard to retain logs indefinitely, a grace period of 30-90 days would suffice for most production scenarios. This feature would also prevent data loss when updating the Docker image, since that requires the container to be restarted.

I wrote to @davidfowl on LinkedIn and he mentioned that there are currently no short-term plans for long-term storage ;-)

davidfowl commented 1 month ago

I think reboots are the most reasonable reason to support optionally persisting telemetry. That said, we do not want to build a data storage model optimized for querying logs, metrics and traces. My thought is that we would support best-effort flushing of telemetry to disk on graceful shutdown (or on some interval) to handle the survival of reboots/upgrades. This isn't a scalable persistent store optimized for queries (this is where real APM systems excel).
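A rough sketch of what "best-effort flushing to disk" could look like, written in Python purely for illustration. Nothing here reflects how the dashboard is or will be implemented: the class name `PersistentRingBuffer`, the JSON snapshot format, and the file path are all assumptions.

```python
import json
import os
import tempfile
from collections import deque


class PersistentRingBuffer:
    """Bounded in-memory buffer with best-effort snapshots to a JSON file."""

    def __init__(self, path, capacity=1000):
        self.path = path
        self.items = deque(maxlen=capacity)
        self._restore()

    def _restore(self):
        # Best effort: a missing or corrupt snapshot just means starting empty.
        try:
            with open(self.path, encoding="utf-8") as f:
                self.items.extend(json.load(f))
        except (OSError, ValueError):
            pass

    def append(self, record):
        self.items.append(record)

    def flush(self):
        # Write to a temp file in the same directory, then rename over the
        # old snapshot, so a crash mid-write never leaves a truncated file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(list(self.items), f)
        os.replace(tmp, self.path)
```

Calling `flush()` from a shutdown hook (for example `atexit.register(buffer.flush)` plus a SIGTERM handler) would cover the graceful-shutdown case, and calling it from a timer would cover the interval case. Anything received after the last flush is lost on a hard kill, which is what makes this best effort rather than a durable store.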

To be clear on the intent here:

sharpSteff commented 1 month ago

@Drsela sums it up quite nicely. The Aspire Dashboard is sufficient for my OTLP use case. I'd like to see the last couple of months of data, and I don't want to cry when my host has to restart. I do not need a separate OTLP collector and database to capture every bit of data. I use it more to show trends.

davidfowl commented 1 month ago

> I do not need a separate OTLP collector and database to capture every bit of data. I use it more to show trends.

How are you going to get trends with the dashboard? We're not adding those features.

sharpSteff commented 1 month ago

> > I do not need a separate OTLP collector and database to capture every bit of data. I use it more to show trends.
>
> How are you going to get trends with the dashboard? We're not adding those features.

I added a custom sub-page myself that presents the data.

mitchdenny commented 1 month ago

Are you running a fork of the dashboard?

Drsela commented 1 month ago

> My thought is that we would support best-effort flushing of telemetry to disk on graceful shutdown (or on some interval) to handle the survival of reboots/upgrades

This would be sufficient for our use case :-) Right now we're using a self-developed dashboard that reads NLog files (rolling windows + 7 days) and visualises them. We also use Application Insights for 'true' APM.

Replacing our own dashboard with standalone Aspire would be awesome, and flushing the telemetry to disk on graceful shutdown would be a great way for us to migrate to it.