elastic / apm

Elastic Application Performance Monitoring - resources and general issue tracking for Elastic APM.
https://www.elastic.co/apm
Apache License 2.0
384 stars 114 forks source link

Updating the metadata hostname specification #517

Closed eyalkoren closed 3 years ago

eyalkoren commented 3 years ago

@elastic/apm-agent-devs we just got a report that the link from APM traces to host metrics may be broken if the hostname discovery is not done as specified hereby, so please take a quick look to see whether such a fix is required in your agent

Agent agent uses always the simple hostname value (no domain) Agent uses the new (not deprecated) intake fields Issue
Java Fixed with https://github.com/elastic/apm-agent-java/issues/2161
.NET https://github.com/elastic/apm-agent-dotnet/issues/1504
Node.js ✅ (os.hostname(), which uses the gethostname syscall)
Python ✅ : (socket.gethostname(), which uses the gethostname syscall)
Ruby ✅ (Socket.gethostname)
Go
PHP ✅ Agent uses PHP function gethostname which is a trivial wrapper around gethostname syscall)
apmmachine commented 3 years ago

:green_heart: Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

#### Build stats * Start Time: 2021-10-22T04:04:00.405+0000 * Duration: 3 min 18 sec * Commit: 6dc72c8037acc4a1f4ee043009e8fd03c38c063e

eyalkoren commented 3 years ago

Following the requirement to provide a more comprehensive hostname detection algorithm and make it as compatible as possible with beats, I did some digging with the beats team and changed the spec accordingly.

I don't have enough resources to asses how generic and comprehensive it is, other than seeing it working wherever I tried it. It is certainly not 100% bulletproof, but our goals are more modest for now. Any comments, proposals and enhancements are very welcome.

@elastic/apm-agent-devs - please take a look, even if you did already.

trentm commented 3 years ago

Is using a language's standard library method for getting the hostname (e.g. socket.gethostname() for Python, os.hostname() for Node.js) acceptable? On unix-y systems these use the gethostname syscall. On Windows (sigh), Node.js and Python use different Windows APIs for this:

Executing uname and possibly hostname will add some cold-startup time for serverless (e.g. Lambda) envs.

basepi commented 3 years ago

@trentm I think the consensus is that the gethostname syscall is acceptable, and doesn't use DNS.

On the Windows end of things, I've been banging the drum for our python agent to drop Windows support anyway because we have no confirmed Windows users. So I'm not stressing about that side, at least for Python.

eyalkoren commented 3 years ago

@trentm I think the consensus is that the gethostname syscall is acceptable, and doesn't use DNS.

👍 Avoiding any API that involves NDS lookup is indeed the main point here.

This is tricky indeed. Eventually, what we need is a language-independent spec. That's the only reason I chose to put it this way, but everything that can be done without invoking an external process is of course preferable, as long as it yields the same result. I will add inline comments saying that equivalent language-specific APIs can be used. Anyone that wishes to extend that to include concrete examples is very welcome. We may eventually "backport" the spec we come up here to ECS, so any input you put here would be useful.

Lastly, I just learned that the currently proposed algorithm can either return the simple hostname OR FQDN, so I will probably extend the algorithm a little bit to handle that.

eyalkoren commented 3 years ago

Added some clarification on what can be used to replace invocation of external commands and extended the algorithm to clear domain parts if the discovered value is a FQDN.