aws / aws-xray-java-agent

The official AWS X-Ray Auto Instrumentation Agent for Java.
Apache License 2.0

AWS Xray Java auto instrumentation using Agent Jar not recording all the requests #64

Closed jijokrishnan33 closed 3 years ago

jijokrishnan33 commented 3 years ago

I have configured AWS X-Ray using the Java agent in ECS, but automatic instrumentation with the agent only records the health check requests. Requests that go through the ALB are not recognized by the Java agent. There is no sampling rule set to restrict requests coming from the ALB. When I hit the service directly using the container IP address instead of the ALB URL, those requests are recognized by the X-Ray agent.

Following is the container definition for the X-Ray daemon:

    {
      "Name": "xray-daemon",
      "Image": { "Ref": "XrayImageURL01" },
      "PortMappings": [
        {
          "hostPort": 0,
          "containerPort": 2000,
          "protocol": "udp"
        }
      ],
      "LogConfiguration": {
        "LogDriver": "awslogs",
        "Options": {
          "awslogs-group": { "Ref": "ContainerCWLogGroup01" },
          "awslogs-region": { "Ref": "AWS::Region" },
          "awslogs-stream-prefix": { "Fn::Sub": "${MicroserviceId01}" }
        }
      }
    }

The agent jar is placed inside the container, and the following entry point script adds the Java agent when starting the application:

    java -javaagent:/usr/app/disco/disco-java-agent.jar=pluginPath=/usr/app/disco/disco-plugins:loggerfactory=software.amazon.disco.agent.reflect.logging.StandardOutputLoggerFactory:verbose -jar test-0.0.1-SNAPSHOT.jar

No other code-level changes for X-Ray are applied to the application deployed in ECS.

willarmiros commented 3 years ago

Hi @jijokrishnan33,

This sounds like it could be a sampling issue. Could you try to follow these docs to either: 1) ignore all the health check requests (set sampling for the health check URL to 0%) or 2) sample 100% of all requests (to ensure both health check and normal requests are sampled). The default sampling rule only samples 1 request per second (plus 5% of additional requests), so if a health check and a normal request come in at the same time, it is very likely that the health check will be sampled but the subsequent request won't.
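To make the "1 request per second plus 5%" behavior concrete, here is a toy model of reservoir-plus-fixed-rate sampling. It is only a sketch for illustration: the class name `DefaultRuleSampler` and its parameters are hypothetical, and this is not the X-Ray SDK's or the agent's actual implementation.

```python
import random


class DefaultRuleSampler:
    """Toy model of 'reservoir + fixed rate' sampling (hypothetical, for illustration)."""

    def __init__(self, reservoir_per_second=1, fixed_rate=0.05, rng=None):
        self.reservoir_per_second = reservoir_per_second
        self.fixed_rate = fixed_rate
        self.rng = rng or random.Random()
        self._current_second = None
        self._used = 0

    def should_sample(self, now):
        """Decide whether a request arriving at time `now` (in seconds) is sampled."""
        second = int(now)
        if second != self._current_second:
            # A new one-second window starts: refill the reservoir.
            self._current_second = second
            self._used = 0
        if self._used < self.reservoir_per_second:
            # The first request(s) in each second take the reservoir slot.
            self._used += 1
            return True
        # Every further request in that second is sampled at the fixed rate.
        return self.rng.random() < self.fixed_rate


sampler = DefaultRuleSampler()
health_check = sampler.should_sample(10.0)   # takes the reservoir slot for second 10
user_request = sampler.should_sample(10.2)   # only a 5% chance of being sampled
print(health_check, user_request)
```

Under this model, a health check that arrives just before a normal request in the same second consumes the reservoir slot, which is why raising the rate to 100% or excluding the health check path are the two suggested experiments.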

If this doesn't work, please enable debug logging and post your debug logs.

jijokrishnan33 commented 3 years ago

Hi @willarmiros ,

I have set the sampling rule to 100% for all requests, but still only the health check requests are being recorded in X-Ray. None of the requests made via the ALB are recorded. I have attached the debug logs for both the X-Ray daemon and the application. Logs.zip
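For readers reproducing this, a "sample everything" rule might look like the sketch below. The thread does not say how the rule was actually created, so this is an assumption: it uses the centralized sampling rule shape accepted by boto3's `create_sampling_rule`, and the rule name, priority, and helper function names are hypothetical.

```python
def build_sample_everything_rule(rule_name="sample-everything", priority=1):
    """Build a sampling rule payload that matches every request at a 100% rate.

    The rule name and priority are placeholders chosen for illustration.
    """
    return {
        "RuleName": rule_name,
        "ResourceARN": "*",
        "Priority": priority,    # lower numbers are evaluated first
        "FixedRate": 1.0,        # sample 100% of requests beyond the reservoir
        "ReservoirSize": 1,
        "ServiceName": "*",
        "ServiceType": "*",
        "Host": "*",
        "HTTPMethod": "*",
        "URLPath": "*",
        "Version": 1,
    }


def create_rule(rule):
    """Send the rule to X-Ray; requires boto3 and AWS credentials (not run here)."""
    import boto3  # imported lazily so building the payload needs no AWS setup

    client = boto3.client("xray")
    return client.create_sampling_rule(SamplingRule=rule)


rule = build_sample_everything_rule()
print(rule["FixedRate"], rule["URLPath"])
```

With a rule like this in effect, sampling should no longer explain missing traces, which matches the report that ALB requests were still not recorded.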

espower commented 3 years ago

Having the exact same issue. I bumped the sampling rules to 100%. A direct request to the container, http://10.x.x.x:1234/some/unique/path, always generates a trace, but the exact same request through the ALB for the ECS service never generates a trace. The ELB health check from the target group always generates a trace.

I only have 1 container in the service and can confirm via CW log entries that both manual GET requests reach the same container.

willarmiros commented 3 years ago

@espower @jijokrishnan33 thank you for the bug reports, it turned out to be a problem on the agent's end. I've opened #66 to address this and it should fix this problem once released.

jijokrishnan33 commented 3 years ago

@willarmiros ,

When will the fix be available for us to use?

willarmiros commented 3 years ago

Hi @jijokrishnan33 I cannot give official dates for releases, but I will try to get one out soon! Feel free to watch the repo to be notified when it's out :)

willarmiros commented 3 years ago

This bug has been fixed in v2.8.0, download today! https://github.com/aws/aws-xray-java-agent/releases/tag/v2.8.0