dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.26k stars 4.73k forks source link

[browser][mt] RangeError: Invalid atomic access index #96545

Closed pavelsavara closed 8 months ago

pavelsavara commented 10 months ago

Error Blob

{
  "ErrorMessage": "RangeError: Invalid atomic access index",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Reproduction Steps

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268

Known issue validation

Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268 Error message validated: RangeError: Invalid atomic access index Result validation: :white_check_mark: Known issue matched with the provided build. Validation performed at: 1/5/2024 12:40:40 PM UTC

Report

Build Definition Test Pull Request
569971 dotnet/runtime WasmTestOnBrowser-System.Transactions.Local.Tests.WorkItemExecution dotnet/runtime#98492
567613 dotnet/runtime WasmTestOnBrowser-System.Threading.Tasks.Dataflow.Tests.WorkItemExecution dotnet/runtime#98482
566720 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
565264 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Configuration.Tests.WorkItemExecution dotnet/runtime#98199
565182 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.DependencyInjection.Tests.WorkItemExecution dotnet/runtime#98336
564782 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution dotnet/runtime#98336
563571 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Functional.Tests.WorkItemExecution dotnet/runtime#98128
563337 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
562985 dotnet/runtime WasmTestOnBrowser-System.IO.Hashing.Tests.WorkItemExecution dotnet/runtime#97213
562540 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution dotnet/runtime#97508
560637 dotnet/runtime WasmTestOnBrowser-System.Threading.Tests.WorkItemExecution dotnet/runtime#98230
559787 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Json.Functional.Tests.WorkItemExecution
559039 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
556974 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution dotnet/runtime#95001
555167 dotnet/runtime WasmTestOnBrowser-System.Threading.Tasks.Parallel.Tests.WorkItemExecution dotnet/runtime#98027
555069 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution dotnet/runtime#97021
549104 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.HostFactoryResolver.Tests.WorkItemExecution dotnet/runtime#97560
549003 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution dotnet/runtime#97806
548143 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution dotnet/runtime#94764
540864 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Json.Functional.Tests.WorkItemExecution dotnet/runtime#97441
536247 dotnet/runtime WasmTestOnBrowser-System.Threading.Tests.WorkItemExecution dotnet/runtime#97270

Summary

24-Hour Hit Count 7-Day Hit Count 1-Month Count
1 10 21
ghost commented 10 months ago

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

Issue Details
### Error Blob ```json { "ErrorMessage": "RangeError: Invalid atomic access index", "BuildRetry": false, "ErrorPattern": "", "ExcludeConsoleLog": false } ``` ### Reproduction Steps Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268
Author: pavelsavara
Assignees: pavelsavara
Labels: `arch-wasm`, `blocking-clean-ci`, `area-VM-threading-mono`, `Known Build Error`, `os-browser`
Milestone: 9.0.0
pavelsavara commented 9 months ago

Log

[22:52:59] info: [2024-01-22T22:52:59.062Z] [PASS] System.Threading.Tests.MonitorTests.IsEntered_Invalid
[22:52:59] warn: MONO_WASM [0xdc19c-main]: Failed to find loaded WebWorker, this may deadlock. Please increase the pthreadPoolSize. Running threads 25. Loading workers: 8
[22:52:59] warn: worker sent an error! http://127.0.0.1:39799/_framework/dotnet.native.worker.js:1: Uncaught RangeError: Invalid atomic access index
[22:52:59] fail: MONO_WASM: Error
                     at Ze (http://127.0.0.1:39799/_framework/dotnet.js:3:19870)
                     at e (http://127.0.0.1:39799/_framework/dotnet.js:3:33924)
                     at http://127.0.0.1:39799/_framework/dotnet.js:3:34069
[22:52:59] info: WASM EXIT 1
lambdageek commented 9 months ago

Not getting very far with trying to repro this locally. See https://github.com/dotnet/runtime/issues/96191#issuecomment-1912631321 for my methodology

pavelsavara commented 9 months ago

Log Build

[15:34:43] info: Starting:    System.Threading.Tasks.Parallel.Tests.dll
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.378] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:39079/_framework/dotnet.native.js:8:102771)
                     at handleMessage (http://127.0.0.1:39079/_framework/dotnet.native.worker.js:1:2103)
[15:34:46] fail: [0x000dc33c 15:34:46.381] MONO_WASM: Uncaught RangeError: Invalid atomic access index
                 Error
                     at Object.Ke [as mono_exit] (http://127.0.0.1:39079/_framework/dotnet.js:3:20623)
                     at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:39079/_framework/dotnet.runtime.js:3:37630)
[15:34:46] warn: [0x000dc33c 15:34:46.384] MONO_WASM: mono_wasm_exit failed: unwind
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.js 2 Uncaught (in promise) RangeError: Invalid atomic access index
pavelsavara commented 9 months ago

Log Build

[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: ThreadPool, executor2: JSWebWorker)
[14:20:00] info: [0x04cde380-jsww 14:20:00.231] MONO_WASM: forceDisposeProxies done: 0 imports, 0 exports, 0 GCHandles, 0 JSHandles.
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:44739/_framework/dotnet.native.js:8:99729)
                     at handleMessage (http://127.0.0.1:44739/_framework/dotnet.native.worker.js:1:2103)
[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: NewThread, executor2: Main)
[14:20:00] fail: [0x000dd05c 14:20:00.346] MONO_WASM: Uncaught RangeError: Invalid atomic access index
                 Error
                     at Object.Ke [as mono_exit] (http://127.0.0.1:44739/_framework/dotnet.js:3:20623)
                     at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:44739/_framework/dotnet.runtime.js:3:37630)
[14:20:00] fail: [out of order message from the browser]: http://127.0.0.1:44739/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index
pavelsavara commented 9 months ago

In both last cases it's the setup of await for next SAB emscripten messsage. It seems from the message prefix, that the thread is not attached to mono and possibly not even yet pthread. So this could be during worker becoming pthread.

Simplified emscripten code

var pthread_ptr = _pthread_self();
var wait = Atomics.waitAsync(GROWABLE_HEAP_I32(), pthread_ptr >> 2, pthread_ptr);

The method is updating the memory view via GROWABLE_HEAP_I32(), so this is not memory growth problem.

The pthread_id in out console message prefix could be obsolete, from the last run of the worker as a pthread.

We are trying to await adress _pthread_self(). I wonder how that could become un-aligned ? Especially with pthread_ptr >> 2 being there.

pavelsavara commented 9 months ago

The memory allocation for the pthread is (latest version, not the one we have)

https://github.com/emscripten-core/emscripten/blob/21f25339df9c6aa8c288297de73236e6aaa79582/system/lib/pthread/pthread_create.c#L165-L172

There is some ASAN dance above it and I'm not sure if this is ASAN opt-out ?

pavelsavara commented 8 months ago

Again, this seems to be thread which is not currently attached. Probably memory growth ?

08:44:17] info: [PASS][14] System.Transactions.Tests.NonMsdtcPromoterTests.PSPENonMSDTCWithDependentClones
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.407] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:33189/_framework/dotnet.native.js:8:102771)
                     at handleMessage (http://127.0.0.1:33189/_framework/dotnet.native.worker.js:1:2103)
[08:44:17] info: [0x000dc93c--UI- 08:44:17.415] MONO_WASM: Dumping web worker info as seen by UI thread, it could be stale:
[08:44:17] info: 00 | 0x000dc93c--UI-: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - UI Thread
[08:44:17] info: 01 | 0x02f9b1f8-norm: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - Finalizer
[08:44:17] info: 02 | 0x032df290-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 03 | 0x035ed010-gate: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET TP Gate
[08:44:17] info: 04 | 0x0436da68-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 05 | 0x03fe2048-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 06 | 0x04a80030-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 07 | 0x04c802e0-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 08 | 0x04f90040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 09 | 0x052a0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 10 | 0x056c0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 11 | 0x058c02f0-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 12 | 0x05bd0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  2 - .NET Long Running Task
[08:44:17] info: 13 | 0x061303b0-timr: isRunning: true isAttached: true isEventLoop:false reuseCount:  2 - .NET Timer
[08:44:17] info: 14 | 0x06627ea8-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 15 | 0x06db0000-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 16 | 0x07069bc8-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 17 | 0x07370040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 18 | 0x07680040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 19 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 20 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 21 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount:  1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 22 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount:  1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 23 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 24 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 25 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 26 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool