elastic / observability-docs

Elastic Observability Documentation

Update tutorial about monitoring Java app to show Elastic Agent #2974

Open dedemorton opened 1 year ago

dedemorton commented 1 year ago

The tutorial needs to be tested and updated to show Elastic Agent instead of Beats.

TODO list:

dedemorton commented 1 year ago

I made a pass at updating the app, but it's challenging because Java itself is the only part of the stack I'm familiar with; most of the dependencies are new to me. To do this work, I'll need a Java developer to:

Here are some detailed notes that I took while I was trying to work through the tutorial.

Step 2: Create a Java application

Rather than asking users to follow the steps to create an app, can we instead provide a starter app? (The goal of this tutorial is not to teach users how to write Java apps.)

Ideally the app would be “owned” by someone on the development team who could keep it up to date with changes to dependencies. The current tutorial hasn’t been updated in 3 years and refers to deprecated versions. It may even use some tools that have better alternatives now. I need a developer to help me sort out some basic questions, like whether we want to continue using these dependencies and, if so, which versions:

Step 2: Ingest logs

Ideally the logging implementation would be set up in our starter app, and we would walk users through the relevant parts of the code rather than telling them to build it all from scratch.

Install and configure Filebeat

This section should be updated to describe how to use Elastic Agent rather than Beats.

Step 3: View logs in Kibana

When I read the following paragraph, I wonder if we're making this hard for users:

“You can see there is a flaw in the request logging. If the user agent is null, something other than null is returned. Reading our logs is crucial; however, just indexing them gains us nothing. To fix this, here is a new request logger.”

Why can't we just show them the example of checking for a null user agent in the code up front and explain why it's important?
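To make the suggestion concrete, here is a minimal sketch of what an up-front null check could look like. It is plain Java with no web framework, and `RequestLogLine`, its `format` method, and the `-` placeholder are all hypothetical names chosen for illustration, not the tutorial's actual request logger.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not the tutorial's request logger.
final class RequestLogLine {

    // Assumed placeholder written when the client sends no User-Agent header.
    static final String UNKNOWN_USER_AGENT = "-";

    // Builds one log line, substituting the placeholder for a null or blank user agent
    // so the log never contains the literal text "null".
    static String format(String method, String path, int status, String userAgent) {
        String agent = Optional.ofNullable(userAgent)
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .orElse(UNKNOWN_USER_AGENT);
        return String.format("%s %s %d user-agent=\"%s\"", method, path, status, agent);
    }

    public static void main(String[] args) {
        System.out.println(format("GET", "/", 200, null));
        System.out.println(format("GET", "/", 200, "curl/8.0"));
    }
}
```

Showing something like this first, with a sentence on why a missing header matters for parsing later, would avoid the "ship a flawed logger, then fix it" detour.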

Step 4: Work with your logs

Structure logs

What approach do we want to recommend for Elastic Agent users? Would they use the Custom Logs integration?

Parse exceptions

Again, I think it might be good enough to show the code in an existing starter app rather than expecting users to add the code for logging exceptions. Would Elastic Agent users need to write multiline configs? Oh heavens.
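For context on why the multiline question comes up at all: a Java stack trace spans many lines, so anything that reads the log file line by line sees one exception as many separate events. The sketch below shows the problem and one possible way around it, which is to escape the newlines before writing. `ExceptionLogLine` and its methods are hypothetical names, and whether this is the approach we want to recommend is exactly the open question.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper for illustration only.
final class ExceptionLogLine {

    // Returns the stack trace exactly as printStackTrace() would print it (multi-line).
    static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Collapses the trace onto one line by replacing each line break with a literal "\n".
    static String toSingleLine(Throwable t) {
        return stackTraceOf(t).trim().replace("\r", "").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Throwable error = new IllegalStateException("boom");
        System.out.println(stackTraceOf(error));  // several lines
        System.out.println(toSingleLine(error));  // one line
    }
}
```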

Configure log rotation

Why isn’t this covered under the section about adding the logging implementation? It seems like this is another example of why providing a pre-built starter app would help users save time.
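As a rough idea of how little code rotation needs when it lives next to the logging setup, here is a sketch that uses only the JDK's `java.util.logging.FileHandler`. That is an assumption made to keep the example dependency-free; the starter app would presumably configure rotation in whatever logging framework it ends up using. The class name, file pattern, and logger name are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.stream.Stream;

// Hypothetical demo class; uses java.util.logging rather than the tutorial's logging library.
final class RotatingFileLog {

    // Writes `messages` log records into a temp directory with size-based rotation
    // and returns how many .log files exist afterwards.
    static long writeAndCountFiles(int maxBytes, int maxFiles, int messages) {
        try {
            Path dir = Files.createTempDirectory("rotation-demo");
            // %g is the generation number: app-0.log is the current file, app-1.log the previous one.
            FileHandler handler = new FileHandler(
                    dir.resolve("app-%g.log").toString(), maxBytes, maxFiles, true);
            handler.setFormatter(new SimpleFormatter());

            Logger logger = Logger.getLogger("rotation-demo");
            logger.setUseParentHandlers(false); // keep demo records off the console
            logger.addHandler(handler);
            for (int i = 0; i < messages; i++) {
                logger.info("request " + i + " handled");
            }
            logger.removeHandler(handler);
            handler.close(); // flushes output and releases the lock file

            try (Stream<Path> files = Files.list(dir)) {
                return files.filter(p -> p.getFileName().toString().endsWith(".log")).count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("log files: " + writeAndCountFiles(1024, 3, 200));
    }
}
```

The handler keeps at most `maxFiles` files, so disk usage stays bounded no matter how long the app runs.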

Ingest node

I understand what we're trying to illustrate here, but I think it could be really confusing to users to learn all the different ways to parse the data and to see how to create ingest pipelines that replace their processor configs. Wouldn't it be better to suggest a best practice?

Write logs as JSON

It sounds here like we've just made users jump through some unnecessary hoops: “Writing out logs as plain text works and is easy to read for humans. However, first writing them out as plain text, parsing them using the dissect processors, and then creating a JSON again sounds tedious and burns unneeded CPU cycles.”

I do see value in teaching users about things like dissect and ingest pipelines, but it makes this tutorial pretty long and forces users to do stuff that's tedious and wasteful. Is there some reason why we shouldn't simply tell folks to use the ecs-logging-java library? We could mention that there are other ways to parse log events without showing examples of all the ways.
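To illustrate what "write JSON directly" buys us, here is a hand-rolled sketch that emits one JSON object per log event using the ECS field names `@timestamp`, `log.level`, and `message`. It deliberately does not use ecs-logging-java (which would handle this for the user); `JsonLogLine` and its methods are hypothetical and exist only to show the shape of the output.

```java
import java.time.Instant;

// Hypothetical illustration of a JSON log line; not the ecs-logging-java API.
final class JsonLogLine {

    // Escapes the characters that are not allowed inside a JSON string literal.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Formats one log event as a single-line JSON object.
    static String format(Instant timestamp, String level, String message) {
        return "{\"@timestamp\":\"" + timestamp + "\","
                + "\"log.level\":\"" + escape(level) + "\","
                + "\"message\":\"" + escape(message) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(format(Instant.now(), "INFO", "Server started"));
    }
}
```

Because each event is already structured, nothing downstream has to dissect plain text and rebuild JSON, which is the wasted work the quoted paragraph describes.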

Or at the very least, we could tell users that the other steps are optional.

Ingest metrics

Add metrics to the application

Again this is something specific to implementing the app, but I’m starting to see the value in showing the app code close to the docs where you talk about ingesting metrics. I still think we could use a starter app, though.

The dependencies in this section also need to be updated.

Install and configure Metricbeat

This section would be replaced by Elastic Agent steps or, I guess, possibly OpenTelemetry? (I don't really know anything about OTel at this point.)

Step 6: View metrics in Kibana

Probably just need to update the steps here.

Step 7: Instrument the application

There’s a lot here…it’s not in my wheelhouse. Someone knowledgeable about APM will need to update the app dependencies and verify that the steps in this section are accurate and still considered best practices.

Step 8: Ingest Uptime Data

Should this section cover synthetic monitoring instead of Uptime?