paketo-buildpacks / dynatrace

A Cloud Native Buildpack that contributes the Dynatrace OneAgent and configures it to connect to the service
Apache License 2.0

Intermittent Failures With ARM64 Builds #118

Open a1flecke opened 1 month ago

a1flecke commented 1 month ago

We have noticed intermittent failures installing Dynatrace OneAgent when using this buildpack to build ARM images and run them on ARM. Debug and verbose logs do not reveal anything helpful to us. Neither removing the pack caching directives nor issuing the clear-cache directive clears it up.

Expected Behavior

Dynatrace OneAgent is installed

Current Behavior

If it matters: We are exploring migrating our existing AMD64 services to ARM64. The other buildpacks are fine. The Dynatrace buildpack is the only one that is experiencing this issue.

Pack build command:

pack build my-app:a-tag \
      --builder paketobuildpacks/builder-jammy-buildpackless-tiny \
      --buildpack paketo-buildpacks/amazon-corretto,paketo-buildpacks/java,paketo-buildpacks/dynatrace \
      --cache-image ***REDACTED**cache:pack-my-app \
      --clear-cache \
      --env ARTIFACTORY_USERNAME=$ARTIFACTORY_USERNAME \
      --env ARTIFACTORY_TOKEN=$ARTIFACTORY_TOKEN \
      --env BP_JVM_VERSION=21 \
      --env BP_MAVEN_BUILD_ARGUMENTS="-Dmaven.test.skip=true --no-transfer-progress package --batch-mode" \
      --env BP_MAVEN_BUILT_ARTIFACT=backend/target/ \
      --env BPE_DEFAULT_BPL_JVM_CLASS_ADJUSTMENT=120% \
      --env BPL_JMX_ENABLED=true \
      --verbose \
      --platform=linux/arm64 \
      --env BP_LOG_LEVEL='DEBUG' \
      --volume $HOME/.m2:/home/cnb/.m2:rw \
      --volume $PWD/bindings/:/platform/bindings/:rw \
      --sbom-output-dir sbom \
      --publish

Sometimes it fails:

Running build for buildpack paketo-buildpacks/dynatrace@5.8.1
Looking up buildpack
Finding plan
Creating plan directory
Preparing paths
Running build command
Paketo Buildpack for Dynatrace 5.8.1
  https://github.com/paketo-buildpacks/dynatrace
  Dynatrace OneAgent 1.293.153.20240702-150912: Contributing to layer
  Warning: Dependency has no SHA256. Skipping cache.
    Downloading from https://**REDACTED***.live.dynatrace.com/api/v1/deployment/installer/agent/unix/paas/latest?bitness=64&skipMetadata=true&arch=arm&include=java
unable to invoke layer creator
unable to get dependency Dynatrace OneAgent. see DEBUG log level
Timer: Builder ran for 2m16.490621239s and ended at 2024-07-18T21:05:27Z
ERROR: failed to build: exit status 1
ERROR: failed to build: executing lifecycle: failed with status code: 51
real    2m57.405s
user    0m21.847s
sys 0m2.489s

Sometimes it passes:

Running build for buildpack paketo-buildpacks/dynatrace@5.8.1
Looking up buildpack
Finding plan
Creating plan directory
Preparing paths
Running build command
Paketo Buildpack for Dynatrace 5.8.1
  https://github.com/paketo-buildpacks/dynatrace
  Dynatrace OneAgent 1.293.153.20240702-150912: Contributing to layer
  Warning: Dependency has no SHA256. Skipping cache.
    Downloading from https://***REDACTED***.live.dynatrace.com/api/v1/deployment/installer/agent/unix/paas/latest?bitness=64&skipMetadata=true&arch=arm&include=java
    Expanding to /layers/paketo-buildpacks_dynatrace/dynatrace-oneagent
    Writing env.launch/BPI_DYNATRACE_BUILDPACK_ID.default
    Writing env.launch/BPI_DYNATRACE_BUILDPACK_VERSION.default
    Writing env.launch/DT_CUSTOM_PROP.append
    Writing env.launch/DT_CUSTOM_PROP.delim
    Writing env.launch/DT_LOGSTREAM.default
    Writing env.launch/LD_PRELOAD.delim
    Writing env.launch/LD_PRELOAD.prepend
  Launch Helper: Contributing to layer
    Creating /layers/paketo-buildpacks_dynatrace/helper/exec.d/properties
Processing layers
Updating environment
Reading output files
Updating buildpack processes
Updating process list
Finished running build for buildpack paketo-buildpacks/dynatrace@5.8.1

Possible Solution

Not sure

Steps to Reproduce

Motivations

dmikusa commented 1 month ago

  Warning: Dependency has no SHA256. Skipping cache.
    Downloading from https://**REDACTED***.live.dynatrace.com/api/v1/deployment/installer/agent/unix/paas/latest?bitness=64&skipMetadata=true&arch=arm&include=java
unable to invoke layer creator
unable to get dependency Dynatrace OneAgent. see DEBUG log level

There is no pre-bundled archive we can use with Dynatrace. We always have to fetch the archive from Dynatrace and then install it. What's happening here is that the buildpack is attempting to fetch the dependency from Dynatrace and it's failing to download. It could be network-related, or it could be related to the server responding slowly.

When you run pack build --env BP_LOG_LEVEL='DEBUG' -v ... does it give you any more details?

a1flecke commented 1 month ago

The above logs were with debug and verbose logging. There were no helpful details.

dmikusa commented 1 month ago

So, it's hard to say. It looks network-related or related to the Dynatrace server's response. You can try adjusting the timeout settings to see if that helps. They correspond to various Go HTTP client timeouts. I put some info below.

I could perhaps see BP_RESPONSE_HEADER_TIMEOUT being too short if the server is taking a while to respond. Maybe also BP_DIALER_TIMEOUT, if it's not getting a quick TCP connection (I believe DNS resolution is also included in this, so if DNS is slow that can throw off the connect timeout too).

Hope that helps!