siosphere opened this issue 1 year ago
We would like to dig into this, but as you may be aware, it is extremely non-trivial to debug this kind of intermittent issue without an isolated reproduction.
We'll try to look into it from time to time, but we will probably be blocked by the lack of a repro.
Understandable. I'm wondering if there is a way to get more debug output out of swc. I could set up our CI/CD pipeline to run with more verbose output if it is available. (I'm not familiar enough with Rust to go poking around and adding debug statements, but if a verbose option doesn't currently exist I may attempt that.)
It is not easy, but if you could run this with a debug build, it may be helpful. But it means
Since it is happening daily, it is worth it for me to go down that path.
I'll take a look at the build directions for swc and get a custom debug build set up. I'll update this issue with any findings.
Thanks,
Some additional output:
@coursehero/html-viewer:test: thread '<unnamed>' panicked at 'range end index 7551504 out of range for slice of length 1626080', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-universal-2.3.0/src/artifact.rs:102:38
@coursehero/html-viewer:test: note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Can you try `RUST_BACKTRACE=1`?
This issue has been automatically closed because it received no activity for a month and had no reproduction to investigate. If you think this was closed by accident, please leave a comment. If you are running into a similar issue, please open a new issue with a reproduction. Thank you.
> no activity for a month
🤔 this seems not correct
@kdy1 any ideas? The issue was opened a week ago, and the last comment was 6 days ago, so there's physically no way this issue can be a month old without activity.
The wording in the comment is wrong. Just removed the `need more info` label.
Finally was able to get some more time to look at this, and got an error with backtrace available:
thread '<unnamed>' panicked at 'range end index 7551504 out of range for slice of length 6303712', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-universal-2.3.0/src/artifact.rs:102:38
Stack backtrace:
0: <unknown>
1: <unknown>
2: <unknown>
3: <unknown>
4: <unknown>
5: _ZN6v8impl12_GLOBAL__N_123FunctionCallbackWrapper6InvokeERKN2v820FunctionCallbackInfoINS2_5ValueEEE
6: _ZN2v88internal12_GLOBAL__N_119HandleApiCallHelperILb0EEENS0_11MaybeHandleINS0_6ObjectEEEPNS0_7IsolateENS0_6HandleINS0_10HeapObjectEEESA_NS8_INS0_20FunctionTemplateInfoEEENS8_IS4_EENS0_16BuiltinArgumentsE
7: _ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE
8: Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit
Re-running the job, it passes just fine. And it is not always the same test, or even the same package, that errors.
Dropping in to link https://github.com/swc-project/plugins/issues/42, which I believe has a similar root cause. I was not able to get the debug info previously (thanks for doing the hard work on that @siosphere!), but I'm seeing a little more detail on my end after updating to the latest swc:
thread '<unnamed>' panicked at 'range end index 6294120 out of range for slice of length 2158560', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-universal-2.3.0/src/artifact.rs:102:38
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Note that `wasmer-engine-universal-2.3.0/src/artifact.rs:102:38` was not part of the original stack trace for my issue or this one, but it seems to perfectly match what @siosphere reported above. I'm not positive, but I'm guessing it points to this line from the wasmer 2.3.0 release? I don't see anything obviously related there, but hopefully @kdy1 or @kwonoj can spot something 🤞
And yes, this happens if `metadata_len` is too large.
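For readers hitting this: the panic is the symptom of an unchecked slice of a length-prefixed buffer. Below is a minimal Node sketch, illustrative only (the header layout and function name are assumptions, not wasmer's actual format), of how a truncated or concurrently-written cache file whose header claims more bytes than exist produces exactly this kind of out-of-range error, and how a bounds check would turn it into a recoverable cache miss:

```javascript
// Illustrative sketch only: a serialized blob starts with an 8-byte
// little-endian length. If the file is truncated, that length exceeds
// the remaining bytes and an unchecked slice fails.
function readMetadata(buf) {
  if (buf.length < 8) return null; // header itself is missing
  const metadataLen = Number(buf.readBigUInt64LE(0));
  // In Rust, `&bytes[8..8 + metadata_len]` panics here with
  // "range end index ... out of range for slice of length ...".
  // Checking the bound first degrades gracefully instead.
  if (8 + metadataLen > buf.length) return null;
  return buf.subarray(8, 8 + metadataLen);
}

// A blob whose header claims 16 bytes of metadata but only carries 4.
const truncated = Buffer.alloc(12);
truncated.writeBigUInt64LE(16n, 0);
console.log(readMetadata(truncated)); // null: corruption detected
```

The numbers in the panic messages above fit this pattern: the "range end index" is the claimed length, and the "slice of length" is what was actually on disk.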
@kdy1 thanks for verifying the source. Is this something the SWC team will be able to look into, or should we upstream the issue to wasmer?
It's not something we can look into.
Is there any solution for this issue? I'm facing the same problem:
thread '<unnamed>' panicked at 'range end index 8811528 out of range for slice of length 7540704', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-universal-2.3.0/src/artifact.rs:102:38
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
always with jest, not sure why!
● Test suite failed to run
failed to handle: range end index 8811528 out of range for slice of length 7540704
I had originally been under the impression this was only happening on some specific Docker images (much older images), and that the errors weren't present on `ubuntu:latest` in GitHub Actions. We recently bumped our Actions runners to much larger instances with 16 cores, and are starting to see these errors in that environment as well. From this, it seems that running many concurrent instances of swc + jest in parallel leads to this issue.
I can relate to that. I tried using the GitHub Actions runner and everything went smoothly. However, when using our self-hosted runner in parallel mode, we encountered some issues. To mitigate the problem, I tried running it with the `--runInBand` option, and while the issue still occurred, it became less frequent.
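If limiting workers helps, it can be encoded in the Jest config rather than remembered as a CLI flag. A minimal sketch, where the `CI` env check and the `50%` figure are illustrative assumptions to tune for your runners:

```javascript
// jest.config.js sketch: full parallelism locally, a single worker
// (equivalent to --runInBand) on the CI runners where the crash shows
// up. The env var name and percentage are illustrative assumptions.
const config = {
  transform: { '^.+\\.(t|j)sx?$': '@swc/jest' },
  maxWorkers: process.env.CI ? 1 : '50%',
};
module.exports = config;
```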
Also getting this issue intermittently; sometimes it is resolved by deleting the `.swc` directory to re-download the loadable plugin.
Has anyone got any resolution for this issue? @ezpuzz @dgreif @HashemKhalifa
@kdy1 Do we have any updates or workarounds on this? I have a very large monorepo whose CI pipes fail intermittently, giving the following error:
These are the current versions we are using
"@swc/core": "^1.3.46",
"@swc/jest": "^0.2.24",
@samar-1601 which jest and react-testing-library versions are you using? It was a memory leak that happened while using an older version of jest and react-testing-library, which required a different way of handling the events.
Here is what I updated; I adjusted my tests to match the new updates.
"@swc-node/register": "1.8.0",
"@swc/cli": "0.1.62",
"@swc/core": "1.3.62",
"@swc/helpers": "0.5.1",
"@swc/jest": "0.2.26",
"@swc/plugin-jest": "^1.5.67",
"@swc/plugin-loadable-components": "0.3.67",
"@swc/plugin-styled-components": "1.5.67",
"jest": "^29.7.0",
"jest-axe": "^8.0.0",
"jest-canvas-mock": "^2.5.2",
"jest-each": "^29.7.0",
"jest-environment-jsdom": "^29.7.0",
"jest-localstorage-mock": "^2.4.26",
"jest-styled-components": "^7.1.1",
"jest-transform-stub": "^2.0.0",
"jest-watch-typeahead": "^2.2.2",
"jest_workaround": "^0.76.0",
"jsdom": "^22.1.0",
"jsonc-eslint-parser": "^2.1.0",
"@testing-library/dom": "^9.3.3",
"@testing-library/jest-dom": "^6.1.4",
"@testing-library/react": "^14.0.0",
"@testing-library/user-event": "^14.4.3",
@HashemKhalifa We are using the following versions.
"jest": "26.6.3",
"jest_workaround": "^0.72.6",
> It was a memory leak that happened while using an older version of jest and react-testing-library that you had to use a different way of handling the events.

Is this related in any way to these node + jest version issues?
We have tried various combinations of `maxWorkers=(some percentage)` and `-w=(some number of threads)`, but none of the memory management steps solves the problem in a concrete manner, i.e. this `range end index` error keeps coming back intermittently.
I have found possible solutions (if the problem is related to jest + node version issues), which include upgrading node to 21.1 and using `workerIdleMemoryLimit` (mentioned in the official jest documentation).
The problem is I am stuck either way here, because both solutions will take a huge effort considering the size of our monorepo, and currently we have our pipes failing intermittently on a daily basis.
I feel you, I've been in your shoes. That's correct about the memory leak issues you mentioned; I had to upgrade both, and since then all has been good. We are currently planning to move away from jest to vitest as well.
@HashemKhalifa Did you use the `workerIdleMemoryLimit` feature to restrict the memory as well? Because in our case, upgrading node to 21.1 as mentioned in https://github.com/jestjs/jest/issues/11956 didn't solve our problem; instead it became a recurring issue. Hoping the jest upgrade to 29.7 and `workerIdleMemoryLimit` solve our problem.
Will update the thread with our findings.
I have tried `workerIdleMemoryLimit`, but it didn't work as I expected. Which React version are you using?
@HashemKhalifa React version is 18.2.0
So if `workerIdleMemoryLimit` didn't work, correct me if I'm wrong, but did just upgrading node and jest solve the issue?
@samar-1601 no, that was one of the reasons that caused memory leaks. On top of that, in our test cases there were so many tests that I had to adjust to fix the leaks, some of which were related to react-testing-library packages like `@testing-library/jest-dom` and `@testing-library/user-event`. I hope that answers your question.
Removing the milestone as there's no repro.
@HashemKhalifa Another observation: using the `--no-compilation-cache` and `--expose-gc` flags reduces the memory consumption of the jest test cases.
Here is the result of a sample run over one of our modules (jest v26.4):
- node v16.10
  - Without `--no-compilation-cache` / `--expose-gc` → 900+ MB
  - With both flags → 580 MB (259 sec)
- node v18.2 (clearly has leaks)
  - Without `--no-compilation-cache` / `--expose-gc` → 1300+ MB
  - With both flags → 3000+ MB (then I manually stopped it)
- node v21.1.0
  - Without `--no-compilation-cache` / `--expose-gc` → 1200+ MB (150 sec)
  - With both flags → around 220-230 MB avg (163 sec)
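A quick way to confirm the flags actually reached the Jest worker process is to probe for `global.gc`, which node only defines under `--expose-gc`. A small sketch (the helper name is ours, not a Jest API):

```javascript
// Forces a collection when --expose-gc is active, then reports heap
// usage; without the flag it just reports current heap usage.
function heapAfterGc() {
  if (typeof global.gc === 'function') {
    global.gc(); // only exists under `node --expose-gc`
  }
  return process.memoryUsage().heapUsed;
}

console.log(`heap used: ${heapAfterGc()} bytes`);
```

Calling this from a global `afterEach` alongside Jest's `--logHeapUsage` flag makes it easier to see whether memory actually drops between test files.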
We too see this consistently. Are there any other diagnostic steps you can recommend? We're running these builds on C7s (xl, I believe), so it's hard to believe the issue is related to memory exhaustion. Any info you can give about the error that might help us understand the issue? Is this reading compiled code from a cache, or reading the compilation response?
It's a long shot, but considering the worker model of jest, is it possible that multiple workers are requesting the same file/module to be compiled at the same time?
Yeah, this may be related to an issue when jest runs with multiple workers, because the error doesn't occur when we use `--runInBand` (`--runInBand` is synonymous with `-w=1`, i.e. 1 worker thread). But the problem is that we can't go ahead with only one worker, because of the time the tests take to execute.
For instance, a process which takes 8-9s to run with `maxWorkers=50%` and 22s with `-w=4` takes 98-100s with `--runInBand`.
@dgreif @kdy1 any other insights?
`--runInBand` is still the only solution we have found which consistently avoids this error. We use `nx` for build parallelization and caching in our monorepo, which allows us to get most of the benefits provided by multiple Jest workers, but we do have a few big packages which still run slowly with `--runInBand` and delay the overall test run from finishing. I wish I had more insight for y'all, but that's all I've got 😅
@dgreif Can you let us know the node and jest versions you are using? We have upgraded to node v21.1.0, but unlike what is mentioned in https://github.com/jestjs/jest/issues/11956, this didn't solve our problem. So we are now going down the path of upgrading jest to v29.7 and fixing tests so as to use `workerIdleMemoryLimit`. Not sure whether this will solve the problem or not :/
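For anyone trying the same route, `workerIdleMemoryLimit` is a Jest config option (Jest 29.x). A minimal sketch, with the limit value an assumption to tune to your runner size:

```javascript
// jest.config.js sketch: Jest restarts any worker whose idle memory
// exceeds the limit, containing leaks between test files. The 512MB
// value is an illustrative assumption, not a recommendation.
const config = {
  transform: { '^.+\\.(t|j)sx?$': '@swc/jest' },
  workerIdleMemoryLimit: '512MB',
};
module.exports = config;
```

Note this only bounds leak accumulation per worker; it does not address the wasmer slice panic itself.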
@HashemKhalifa can you give some guidance on how you detected the leaks and then fixed them?
For us, currently using `--no-compilation-cache` seems to fix the memory pile-up issue.
Here is a sample for the same:

| Flag | Without | With |
|---|---|---|
| `--no-compilation-cache` | | |
I can share my findings from working on this issue last year; maybe they could help:
Updating Node reduced the leaks but didn't solve the problem completely: https://github.com/jestjs/jest/issues/11956 https://issues.chromium.org/issues/42202158
Updating Jest still didn't solve the problem entirely, but reduced how often it happened.
I'm not sure if it's related to SWC, but it definitely exacerbates the issue in Jest. I was hoping `--workerIdleMemoryLimit` would solve the issue, but it didn't have any effect.
As @dgreif mentioned, there's only one option, which is `--runInBand`, because Jest instances with memory leaks are not able to shut down and keep running forever until SWC complains.
Our solution:
Run tests in-band (sequentially) within each Jest instance. Keep parallelization at a higher level using NX, running multiple Jest instances concurrently. This approach isolates each test instance, preventing memory leaks from spreading, while still leveraging parallel execution for improved overall performance.
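The approach above can be sketched as a single command, where NX fans out across projects while each Jest process stays in-band (the target name and parallel count are illustrative assumptions):

```javascript
// Builds the test command described above: NX provides parallelism
// across projects; everything after `--` is forwarded to each Jest
// instance, pinning it to a single worker.
function buildTestCommand(parallel) {
  return `nx run-many --target=test --parallel=${parallel} -- --runInBand`;
}

console.log(buildTestCommand(4));
// → nx run-many --target=test --parallel=4 -- --runInBand
```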
Describe the bug
This is an intermittent bug with @swc/jest and/or @swc/core.
We are running our test suite with Turborepo, utilizing @swc/jest, and intermittently we get a failure like this:
Retrying the CI/CD job will succeed just fine.
This does not happen very often, roughly 1 in 50 runs. That error message appears to come from Rust, which points me more towards an @swc issue than a jest-specific issue.
Input code
No response
Config
Playground link
No response
Expected behavior
Test suites should not intermittently fail with no changes
Actual behavior
Fails intermittently
Version
1.2.128
Additional context
"@swc/cli": "^0.1.57", "@swc/core": "^1.2.128", "@swc/jest": "^0.2.15",