bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Non-deterministic Arm64 crashes w/ ArrayIndexOutOfBoundsException in ActionInputMap #16974

Closed: ulfjack closed this issue 1 year ago

ulfjack commented 1 year ago

Description of the bug:

We're occasionally seeing Bazel crashes in our Arm64 CI pipeline, like this:

(14:48:12) FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'ActionLookupData{actionLookupKey=ConfiguredTargetKey{label=//java/com/engflow/re:extract_classes, config=BuildConfigurationKey[0690f83395bc4bf234364953daa84306918ca3d66704a8628280ccc9a3dcb961]}, actionIndex=3}' (requested by nodes 'File:[[<execution_root>]bazel-out/aarch64-opt/internal]_middlemen/java_Scom_Sengflow_Sre_Sextract_Uclasses-runfiles')
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:665)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index -1255515904 out of bounds for length 3568
    at com.google.devtools.build.lib.actions.ActionInputMap.putIfAbsent(ActionInputMap.java:332)
    at com.google.devtools.build.lib.actions.ActionInputMap.putWithNoDepOwner(ActionInputMap.java:310)
    at com.google.devtools.build.lib.actions.ActionInputMap.put(ActionInputMap.java:280)
    at com.google.devtools.build.lib.skyframe.ActionInputMapHelper.addToMap(ActionInputMapHelper.java:123)
    at com.google.devtools.build.lib.skyframe.ActionInputMapHelper.addToMap(ActionInputMapHelper.java:51)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.accumulateInputs(ActionExecutionFunction.java:1252)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkInputs(ActionExecutionFunction.java:1063)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:267)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:163)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:591)
    ... 4 more

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

The crashes are non-deterministic. We haven't found a way to reproduce them reliably.

Which operating system are you running Bazel on?

Arm64

What is the output of bazel info release?

release 6.0.0-pre.20220608.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

I looked for ActionInputMap and ArrayIndexOutOfBoundsException, but did not find any relevant results.

Any other information, logs, or outputs that you want to share?

No response

zhengwei143 commented 1 year ago

@ulfjack Can you reproduce this at HEAD?

ulfjack commented 1 year ago

So far I have only seen it in CI (I don't have an Arm64 Linux machine immediately available). Looking through our CI results, I don't see any failures in the last few days; we're on 7.0.0-pre.20221212.2 now.