daytonaio / daytona

The Open Source Dev Environment Manager.
https://daytona.io
Apache License 2.0

Persistent logging #755

Open nkkko opened 2 months ago

nkkko commented 2 months ago

Is your feature request related to a problem? Please describe.

The Daytona server occasionally crashes for various reasons, and without persistent logging, diagnosing these crashes is difficult.

Describe the solution you'd like

Implement a persistent logging mechanism:

Tpuljak commented 2 months ago

We do write logs to a file, and the initial idea was that the daytona server logs command would just read from that file so the logs are always available. In the meantime, we changed the command so it gets the logs from the API, which means the server must be running.

This is not ideal, and we should give the user an option to read straight from the file, bypassing the server health check. That way the user can always get the logs, even if the server has failed.
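
A minimal sketch of the read-straight-from-the-file option, assuming a plain-text log file. The names lastLines and readLogTail, and the sample log content, are hypothetical and are not Daytona's actual code or log location. The demo writes a temporary file so it can run on its own.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lastLines returns the final n lines of content (fewer if content is shorter).
func lastLines(content string, n int) []string {
	content = strings.TrimRight(content, "\n")
	if content == "" || n <= 0 {
		return nil
	}
	lines := strings.Split(content, "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return lines
}

// readLogTail reads the log file at path directly from disk, without
// contacting any server, and returns its last n lines.
func readLogTail(path string, n int) ([]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("reading log file: %w", err)
	}
	return lastLines(string(data), n), nil
}

func main() {
	// Demo only: create a sample log in a temp file (hypothetical content),
	// then read its tail the way a "read from file" option might.
	f, err := os.CreateTemp("", "daytona-server-*.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.Remove(f.Name())
	fmt.Fprint(f, "server started\nworkspace created\nfatal: server crashed\n")
	f.Close()

	lines, err := readLogTail(f.Name(), 2)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Because nothing here depends on the server process, the same call works whether or not the server is alive. Reading the whole file is fine for a sketch; a large log would call for seeking from the end instead.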

Regarding log levels, that's already handled.

Log rotation is something we should manage, I agree.
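
To make the rotation point concrete, here is a rough sketch of size-based rotation using only the Go standard library. The rotatingWriter type, the 32-byte limit, and the single ".1" backup are illustrative assumptions, not how Daytona manages (or will manage) its log files.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rotatingWriter appends to a file and, when a write would push the file
// past maxBytes, renames it to "<path>.1" and starts a fresh file.
// Only one backup is kept; an older backup is overwritten.
type rotatingWriter struct {
	path     string
	maxBytes int64
	file     *os.File
	size     int64
}

func newRotatingWriter(path string, maxBytes int64) (*rotatingWriter, error) {
	w := &rotatingWriter{path: path, maxBytes: maxBytes}
	if err := w.open(); err != nil {
		return nil, err
	}
	return w, nil
}

// open opens (or creates) the log file in append mode and records its size.
func (w *rotatingWriter) open() error {
	f, err := os.OpenFile(w.path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	info, err := f.Stat()
	if err != nil {
		f.Close()
		return err
	}
	w.file, w.size = f, info.Size()
	return nil
}

// Write rotates first if the file is non-empty and p would exceed the limit.
func (w *rotatingWriter) Write(p []byte) (int, error) {
	if w.size > 0 && w.size+int64(len(p)) > w.maxBytes {
		if err := w.rotate(); err != nil {
			return 0, err
		}
	}
	n, err := w.file.Write(p)
	w.size += int64(n)
	return n, err
}

func (w *rotatingWriter) rotate() error {
	if err := w.file.Close(); err != nil {
		return err
	}
	if err := os.Rename(w.path, w.path+".1"); err != nil {
		return err
	}
	return w.open()
}

func (w *rotatingWriter) Close() error { return w.file.Close() }

// fileSize returns the size of the file at path, or -1 if it cannot be read.
func fileSize(path string) int64 {
	info, err := os.Stat(path)
	if err != nil {
		return -1
	}
	return info.Size()
}

func main() {
	dir, err := os.MkdirTemp("", "daytona-logs")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "server.log") // hypothetical file name
	w, err := newRotatingWriter(path, 32)    // tiny limit so the demo rotates
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for i := 0; i < 5; i++ {
		fmt.Fprintf(w, "log line number %d\n", i)
	}
	w.Close()
	fmt.Println("current:", fileSize(path), "bytes, backup:", fileSize(path+".1"), "bytes")
}
```

The writer is not safe for concurrent use and has no age-based expiry. A real implementation would need at least a mutex and a retention policy.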

SvenDowideit commented 1 month ago

I wonder if using OpenTelemetry logging everywhere, with a default logging provider, would create an abstraction that lets both the Daytona server and dev containers have magic logging by default (so rotation, expiry and so on can be outsourced to someone else's code). That would also open the door for all Daytona-based logging to be directed according to the user's preferences.

Tpuljak commented 1 month ago

lets both daytona server, and dev containers have magic logging by default

@SvenDowideit can you elaborate a bit more on this?

We've already set up all the logging (for the server, workspaces, etc.). The things that are missing are:

SvenDowideit commented 1 month ago

so my slightly shift-left thought is - what if, instead of implementing logging features in Daytona, Daytona sets up Grafana Loki (by default, but allows the admin to configure other options) and uses OpenTelemetry logging? That way, Daytona doesn't need to implement log rotation, a viewer, etc., and creates an ecosystem in which developers can assume a default otel logger is available to any dev env.

(an alternative thought is https://logdy.dev/ )

Tpuljak commented 1 month ago

so my slightly shift-left thought is - what if, instead of implementing logging features in Daytona, Daytona sets up Grafana Loki (by default, but allows the admin to configure other options) and uses OpenTelemetry logging? That way, Daytona doesn't need to implement log rotation, a viewer, etc., and creates an ecosystem in which developers can assume a default otel logger is available to any dev env.

@SvenDowideit I like what you're aiming for. We do strive to make everything as extensible as possible and implementing otel logging while allowing the user to configure everything would be appealing.

Tagged the issue with Discussion to hear thoughts from the rest of the team and opened #772 to be solved in the meantime.

vedranjukic commented 1 month ago

First, we should do a small refactor to replace our custom file output logic with a logrus hook like https://github.com/rifflock/lfshook.

The server process already writes its logs to stdout, and the user can stream them to a third-party service.
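
As a rough illustration of the general idea (one logger, several destinations), here is a sketch using only the Go standard library. It does not use logrus or lfshook, whose hook mechanism is what the refactor above proposes. The names newTeeLogger and memorySink are hypothetical, and memorySink merely stands in for a log file.

```go
package main

import (
	"io"
	"log"
	"os"
)

// memorySink is a stand-in for a log file: it keeps everything written to it.
type memorySink struct{ data []byte }

func (m *memorySink) Write(p []byte) (int, error) {
	m.data = append(m.data, p...)
	return len(p), nil
}

// newTeeLogger returns a logger that writes every entry to all given writers.
func newTeeLogger(writers ...io.Writer) *log.Logger {
	return log.New(io.MultiWriter(writers...), "", log.LstdFlags)
}

func main() {
	fileLike := &memorySink{}
	logger := newTeeLogger(os.Stdout, fileLike)
	logger.Println("server started")

	// The same entry went to stdout (for streaming to another service)
	// and to the file stand-in (for persistence).
	log.Printf("persisted copy holds %d bytes", len(fileLike.data))
}
```

A logrus hook would serve the same purpose while staying inside the logging library the project already uses.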