Closed pavelsavara closed 8 months ago
Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.
Author: | pavelsavara |
---|---|
Assignees: | pavelsavara |
Labels: | `arch-wasm`, `blocking-clean-ci`, `area-VM-threading-mono`, `Known Build Error`, `os-browser` |
Milestone: | 9.0.0 |
[22:52:59] info: [2024-01-22T22:52:59.062Z] [PASS] System.Threading.Tests.MonitorTests.IsEntered_Invalid
[22:52:59] warn: MONO_WASM [0xdc19c-main]: Failed to find loaded WebWorker, this may deadlock. Please increase the pthreadPoolSize. Running threads 25. Loading workers: 8
[22:52:59] warn: worker sent an error! http://127.0.0.1:39799/_framework/dotnet.native.worker.js:1: Uncaught RangeError: Invalid atomic access index
[22:52:59] fail: MONO_WASM: Error
at Ze (http://127.0.0.1:39799/_framework/dotnet.js:3:19870)
at e (http://127.0.0.1:39799/_framework/dotnet.js:3:33924)
at http://127.0.0.1:39799/_framework/dotnet.js:3:34069
[22:52:59] info: WASM EXIT 1
Not getting very far with trying to repro this locally. See https://github.com/dotnet/runtime/issues/96191#issuecomment-1912631321 for my methodology
[15:34:43] info: Starting: System.Threading.Tasks.Parallel.Tests.dll
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.378] MONO_WASM: Invalid atomic access index
RangeError: Invalid atomic access index
at Atomics.waitAsync (<anonymous>)
at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:39079/_framework/dotnet.native.js:8:102771)
at handleMessage (http://127.0.0.1:39079/_framework/dotnet.native.worker.js:1:2103)
[15:34:46] fail: [0x000dc33c 15:34:46.381] MONO_WASM: Uncaught RangeError: Invalid atomic access index
Error
at Object.Ke [as mono_exit] (http://127.0.0.1:39079/_framework/dotnet.js:3:20623)
at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:39079/_framework/dotnet.runtime.js:3:37630)
[15:34:46] warn: [0x000dc33c 15:34:46.384] MONO_WASM: mono_wasm_exit failed: unwind
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.js 2 Uncaught (in promise) RangeError: Invalid atomic access index
[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: ThreadPool, executor2: JSWebWorker)
[14:20:00] info: [0x04cde380-jsww 14:20:00.231] MONO_WASM: forceDisposeProxies done: 0 imports, 0 exports, 0 GCHandles, 0 JSHandles.
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Invalid atomic access index
RangeError: Invalid atomic access index
at Atomics.waitAsync (<anonymous>)
at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:44739/_framework/dotnet.native.js:8:99729)
at handleMessage (http://127.0.0.1:44739/_framework/dotnet.native.worker.js:1:2103)
[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: NewThread, executor2: Main)
[14:20:00] fail: [0x000dd05c 14:20:00.346] MONO_WASM: Uncaught RangeError: Invalid atomic access index
Error
at Object.Ke [as mono_exit] (http://127.0.0.1:44739/_framework/dotnet.js:3:20623)
at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:44739/_framework/dotnet.runtime.js:3:37630)
[14:20:00] fail: [out of order message from the browser]: http://127.0.0.1:44739/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index
In both last cases it's the setup of await for next SAB emscripten messsage. It seems from the message prefix, that the thread is not attached to mono and possibly not even yet pthread. So this could be during worker becoming pthread.
Simplified emscripten code
var pthread_ptr = _pthread_self();
var wait = Atomics.waitAsync(GROWABLE_HEAP_I32(), pthread_ptr >> 2, pthread_ptr);
The method is updating the memory view via GROWABLE_HEAP_I32()
, so this is not memory growth problem.
The pthread_id in out console message prefix could be obsolete, from the last run of the worker as a pthread.
We are trying to await adress _pthread_self()
. I wonder how that could become un-aligned ?
Especially with pthread_ptr >> 2
being there.
The memory allocation for the pthread is (latest version, not the one we have)
There is some ASAN dance above it and I'm not sure if this is ASAN opt-out ?
Again, this seems to be thread which is not currently attached. Probably memory growth ?
08:44:17] info: [PASS][14] System.Transactions.Tests.NonMsdtcPromoterTests.PSPENonMSDTCWithDependentClones
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.407] MONO_WASM: Invalid atomic access index
RangeError: Invalid atomic access index
at Atomics.waitAsync (<anonymous>)
at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:33189/_framework/dotnet.native.js:8:102771)
at handleMessage (http://127.0.0.1:33189/_framework/dotnet.native.worker.js:1:2103)
[08:44:17] info: [0x000dc93c--UI- 08:44:17.415] MONO_WASM: Dumping web worker info as seen by UI thread, it could be stale:
[08:44:17] info: 00 | 0x000dc93c--UI-: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - UI Thread
[08:44:17] info: 01 | 0x02f9b1f8-norm: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - Finalizer
[08:44:17] info: 02 | 0x032df290-pool: isRunning: true isAttached: true isEventLoop: true reuseCount: 1 - .NET TP Worker
[08:44:17] info: 03 | 0x035ed010-gate: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET TP Gate
[08:44:17] info: 04 | 0x0436da68-pool: isRunning: true isAttached: true isEventLoop: true reuseCount: 1 - .NET TP Worker
[08:44:17] info: 05 | 0x03fe2048-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 06 | 0x04a80030-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 07 | 0x04c802e0-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 08 | 0x04f90040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 09 | 0x052a0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 10 | 0x056c0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 11 | 0x058c02f0-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 12 | 0x05bd0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 2 - .NET Long Running Task
[08:44:17] info: 13 | 0x061303b0-timr: isRunning: true isAttached: true isEventLoop:false reuseCount: 2 - .NET Timer
[08:44:17] info: 14 | 0x06627ea8-pool: isRunning: true isAttached: true isEventLoop: true reuseCount: 1 - .NET TP Worker
[08:44:17] info: 15 | 0x06db0000-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 16 | 0x07069bc8-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 17 | 0x07370040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 18 | 0x07680040-long: isRunning: true isAttached: true isEventLoop:false reuseCount: 1 - .NET Long Running Task
[08:44:17] info: 19 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
[08:44:17] info: 20 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
[08:44:17] info: 21 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount: 1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 22 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount: 1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 23 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
[08:44:17] info: 24 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
[08:44:17] info: 25 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
[08:44:17] info: 26 | - : isRunning:false isAttached:false isEventLoop:false reuseCount: 0 - emscripten-pool
Error Blob
Reproduction Steps
Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268
Known issue validation
Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268 Error message validated:
RangeError: Invalid atomic access index
Result validation: :white_check_mark: Known issue matched with the provided build. Validation performed at: 1/5/2024 12:40:40 PM UTCReport
Summary