Open sunishdahiya opened 3 years ago
@sunishdahiya Nice to meet you, and thank you for reporting this bug.
Based on what I see so far and your description, this looks like a classic example of asynchronous transaction state conflation. Because of Node's asynchronous execution model the Agent can sometimes loose track of what the current transaction is. This can be especially true if you have
ELASTIC_APM_ASYNC_HOOKS=false
set to false and you're using a DB package or HTTP fetching package that we don't explicitly instrument.
We have a few questions that will better help us understand this.
ELASTIC_APM_ASYNC_HOOKS=false
? If you set this to true
does the problem remain?With that information in hand we should be able to better diagnose this issue and advise.
Hi @astorm , Thanks for the quick response
We have set ELASTIC_APM_ASYNC_HOOKS
to false because in the documentation it is mentioned that ASYNC_HOOKS will have no effect if the node version is older than 8.2.0 and all our services use node version 8.1.0.
"sequelize": "^3.19.3" and "mongoose": "~4.5.9"
."request-promise": "^4.2.4"
startTransaction
API only to capture kafka-events. Code for that is given below
const recordTransaction = async (transactionType, transactionName, transactionHandler, ...transactionHandlerArgs) => {
const transaction = apm.startTransaction(transactionName, transactionType)
try{
await transactionHandler(...transactionHandlerArgs);
transaction.end()
} catch(error) {
await apm.captureError(error)
transaction.end()
throw error
}
}
@astorm Any update on this?
I am using default value for asyncHooks, which documentations says its true and I am seeing wrong transaction traces
With version 3.24.0 of the agent there was a significant re-work of the async run context handling that is largely responsible for these kinds of issues.
However, often it really comes down to specifics. @williamoliveira I would encourage you to open a separate issue with as much detail as you can provide. The ideal is a repro case showing the issue.
@sunishdahiya Sorry for the delay. With the above information I'm not able to confirm whether the changes in v3.24.0 of the agent fix the issue for you. Are you able to retry with the latest agent version? Then we could go from there?
await Promise.resolve(); helps to workaround this issue. Just place it before the manual transaction/span creation.
Describe the bug We have integrated the APM in our microservice architecture making use of distributed tracing feature. But we are running into a issue that often child transactions and spans end up on the wrong parent transaction. This happens mostly when requests come simultaneously(concurrently).
Example trace having the above mentioned issue:
In the above trace the
/getIntersectingHubs
API call shouldn't be there as inapi/alerting/getAlerts
we never call this API.Expected behavior There shouldn't be any extra API or database call in the distributed trace.
Given below is the correct distributed trace for
api/alerting/getAlerts
API.Environment (please complete the following information)
How are you starting the agent? (please tick one of the boxes)
agent.start()
directly (e.g.require('elastic-apm-node').start(...)
)elastic-apm-node/start
from within the source code-r elastic-apm-node/start
Additional context
ELASTIC_APM_ENABLED=true ELASTIC_APM_SERVER_URL="http://:8200"
ELASTIC_APM_SERVICE_NAME="$SERVICE_ID-$ENV"
ELASTIC_APM_ASYNC_HOOKS=false
ELASTIC_APM_CAPTURE_BODY="off"
ELASTIC_APM_TRANSACTION_SAMPLE_RATE=0.1
ELASTIC_APM_CAPTURE_EXCEPTIONS=true
ELASTIC_APM_CAPTURE_ERROR_LOG_STACK_TRACES="messages"
ELASTIC_APM_CAPTURE_SPAN_STACK_TRACES=false
ELASTIC_APM_SOURCE_LINES_SPAN_APP_FRAMES=0
ELASTIC_APM_SOURCE_LINES_SPAN_LIBRARY_FRAMES=0
ELASTIC_APM_SOURCE_LINES_ERROR_APP_FRAMES=0
ELASTIC_APM_SOURCE_LINES_ERROR_LIBRARY_FRAMES=0
ELASTIC_APM_ERROR_MESSAGE_MAX_LENGTH="1kb"
ELASTIC_APM_STACK_TRACE_LIMIT=0
ELASTIC_APM_TRANSACTION_MAX_SPANS=100
ELASTIC_APM_API_REQUEST_TIME="10s"
ELASTIC_APM_API_REQUEST_SIZE="768kb"
ELASTIC_APM_METRICS_INTERVAL="30s"