elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

Trouble enabling Universal Profiling through HAProxy in an Elastic Cloud cluster with Google Private Service Connect #3627

Open b2ronn opened 11 months ago

b2ronn commented 11 months ago

In our Elastic Cloud cluster, we use traffic filtering through Google Private Service Connect. We also have HAProxy configured on our side so that we can use our own domain names (apm.domain.com, kibana.domain.com, elastic.domain.com, fleet.domain.com, profiling.domain.com) to access the services. Elasticsearch, Kibana, and Fleet are available and working. The Elasticsearch backend:

backend elastic
  mode http
  server app-server <deployment-name>.es.psc.europe-west3.gcp.cloud.es.io:443 check ssl verify none
  http-request set-header X-Found-Cluster <deployment-name>.es
  http-response del-header X-Cloud-Request-Id
  http-response del-header X-Found-Handling-Cluster
  http-response del-header X-Found-Handling-Instance
  http-response del-header X-Found-Handling-Server

The Universal Profiling HAProxy backend configuration:

backend profiling
  mode http
  server app-server <deployment-name>.profiling.psc.europe-west3.gcp.cloud.es.io:443 check ssl verify none
  http-request set-header X-Found-Cluster <deployment-name>.profiling

Trying to run pf-host-agent:

pf-host-agent -project-id=1 -tags='cloud_region:europe-west3;env:staging' -secret-token=TOKEN  -collection-agent=profiling.domain.com:443 -v 

However, all attempts to enable Universal Profiling through HAProxy fail: the agent binary cannot connect to the collector. I receive the following error:

level=debug msg="Sending host metadata..."
level=warning msg="Failed to report host metadata (retrying...): ReportHostMetadata failed: Unimplemented"

and in the responses, I see the following message:

grpc-message: unknown service collectionagent.CollectionAgent

Without Google Private Service Connect, the agents were able to connect and send events.

cmacknz commented 11 months ago

This seems specific to pf-host-agent rather than Elastic Agent, or possibly to HAProxy. @SeanHeelan might have an idea or know where to route this issue.

b2ronn commented 11 months ago

With the Google Private Service Connect filter activated, I can't use the collection-agent endpoint that Kibana generates for the agent, and it is not possible to connect through the *profiling.psc.europe-west3.gcp.cloud.es.io:443 endpoint either.

thomasdullien commented 11 months ago

Interesting. So from within the client VPC, the endpoint should be visible at profiling.psc.europe-west3.gcp.cloud.es.io:443, right?

Looking at the pf-host-agent command line:

pf-host-agent -project-id=1 \
  -tags='cloud_region:europe-west3;env:staging' \
  -secret-token=TOKEN  \
  -collection-agent=profiling.domain.com:443 -v 

The -collection-agent parameter should likely point to the domain profiling.psc...?
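As a rough sketch of that suggestion, the command could be assembled with the Private Service Connect hostname from the HAProxy backend shown earlier instead of profiling.domain.com. The helper name `pf_host_agent_cmd` and the alias `my-deployment` are made up for illustration, and whether the agent can reach this endpoint directly from the VPC is an assumption, not something confirmed in this thread. The helper only prints the command so the endpoint can be reviewed before running it.

```shell
# Hypothetical helper: prints (does not run) the pf-host-agent command
# with the collector pointed at the PSC profiling endpoint.
pf_host_agent_cmd() {
  # $1 = deployment alias (assumed to match <deployment-name> in the HAProxy config)
  collector="$1.profiling.psc.europe-west3.gcp.cloud.es.io:443"
  printf '%s\n' "pf-host-agent -project-id=1 -tags='cloud_region:europe-west3;env:staging' -secret-token=TOKEN -collection-agent=${collector} -v"
}

# "my-deployment" is a placeholder alias, replace it with your own.
pf_host_agent_cmd my-deployment
```

All flags are copied from the original command; only the -collection-agent value changes.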

b2ronn commented 11 months ago

According to the instructions at https://www.elastic.co/guide/en/cloud/current/ec-traffic-filtering-psc.html, when Google Private Service Connect filtering is enabled, we must use endpoints of the form https://{alias}.{product}.{private_hosted_zone_domain_name}.

In addition, to use our own domain for Elastic Cloud, we need to configure a reverse proxy, as explained in https://www.elastic.co/guide/en/cloud/current/ec-regional-deployment-aliases.html#ec_setting_up_a_proxy. We use our domain profiling.domain.com to access <alias>.profiling.psc.europe-west3.gcp.cloud.es.io.

However, when it comes to profiling, it is challenging to route traffic when using a reverse proxy along with Google Private Service Connect.
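To make the endpoint template above concrete, here is a minimal shell sketch that expands it. The function name `psc_endpoint` and the alias `my-deployment` are hypothetical; the product names (es, profiling) and the zone psc.europe-west3.gcp.cloud.es.io are taken from the HAProxy backends earlier in this thread.

```shell
# Hypothetical helper: expands
#   https://{alias}.{product}.{private_hosted_zone_domain_name}
psc_endpoint() {
  # $1 = deployment alias, $2 = product (e.g. es, profiling),
  # $3 = private hosted zone domain name for the region
  printf 'https://%s.%s.%s\n' "$1" "$2" "$3"
}

# "my-deployment" is a placeholder alias.
psc_endpoint my-deployment profiling psc.europe-west3.gcp.cloud.es.io
# prints: https://my-deployment.profiling.psc.europe-west3.gcp.cloud.es.io
```

The resulting host is what the reverse proxy backend for profiling.domain.com forwards to.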

b2ronn commented 10 months ago

Do you have any news or recommendations?

thomasdullien commented 10 months ago

Digging into the docs, will reply shortly.

b2ronn commented 10 months ago

Do you have any news or recommendations?

Aubermean commented 5 months ago

Did you manage to resolve this? I'm facing a similar problem.

florianl commented 5 months ago

Since October 2023, new versions of the Elastic Stack, including Universal Profiling, have been released. We also recently improved error handling when dealing with proxies; this is included in the 8.13 release. Please check it out.