tc39 / test262

Official ECMAScript Conformance Test Suite

Unnecessary memory use in some tests causes false failures on constrained hardware #3328

Open phoddie opened 2 years ago

phoddie commented 2 years ago

The XS JavaScript engine is designed to run on resource-constrained embedded hardware. It runs ECMAScript 2021 on devices with as little as 16 KB of RAM.

We have typically run Test262 against XS running on a computer, where there is no practical limit on the memory available to the tests. We have slowly been working on running Test262 on embedded devices as well. Our current focus is the ESP32, where the default VM size we test with is 64 KB. Our hope is to eventually run Test262 on devices with less memory, so this is a first step.

As expected, some tests fail because of memory exhaustion. That's probably inevitable for some tests. But there are tests that would pass if they required less memory. A good example is this TypedArray.prototype.copyWithin test:

https://github.com/tc39/test262/blob/9f2814f00ff7612f74024c15c165853b6765c7ab/test/built-ins/TypedArray/prototype/copyWithin/coerced-values-end-detached-prototype.js#L34-L40

Here the 10,000-element array causes a TypedArray allocation failure on the ESP32, which prevents the test from completing. Reducing it to 1,000 allows the test to pass. The larger array size does not appear to contribute to validating conformance with the standard. It would be beneficial for developers using XS to know that it implements the correct behavior, so we'd like tests like this to pass.
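To put rough numbers on it (these are estimates, not measurements from XS, and assume the test runs against Float64Array as the TypedArray tests typically do across all constructors):

```js
// Rough backing-store cost of the allocation, assuming the largest element
// size, Float64Array at 8 bytes per element. Estimates only.
var largeBytes = 10000 * Float64Array.BYTES_PER_ELEMENT; // 80,000 bytes -- larger than the whole 64 KB VM
var smallBytes = 1000 * Float64Array.BYTES_PER_ELEMENT;  //  8,000 bytes -- fits comfortably
```

The construction fails before the behavior under test (side effects during argument coercion) is ever reached.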

There are several possible ways to resolve this -- changing the test for all execution environments, adding some kind of configuration option, etc. Before diving into that, I wanted to see if test262 is open to addressing issues like these, which impact the use of test262 to validate JavaScript on constrained device targets.

jugglinmike commented 2 years ago

Thanks for the report!

For tail-call optimization, Test262 maintains a special value called $MAX_ITERATIONS which authors can use to set an abstract upper boundary on the number of stack frames necessary to prove TCO is taking place.

Test262 could take a similar approach here by offering a value like $LARGE_ARRAY_LENGTH. That would likewise empower authors to write tests without making assumptions about runtime capabilities. It would also reduce the number of arbitrary (or "magic") constants in test material, and replace them with identifiers that more clearly reflect the intent.
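For illustration, a test using such a value might look like this (a sketch only; `$LARGE_ARRAY_LENGTH` does not exist in the harness today, while `testWithTypedArrayConstructors` is an existing harness helper):

```js
// Hypothetical harness include, analogous to the existing $MAX_ITERATIONS
// used by the TCO tests. A host targeting constrained hardware could lower
// this one value instead of editing individual tests.
var $LARGE_ARRAY_LENGTH = 10000;

// Hypothetical test body: the intent ("a large array") is named, and no
// magic constant appears in the test itself.
testWithTypedArrayConstructors(function(TA) {
  var sample = new TA($LARGE_ARRAY_LENGTH);
  // ... exercise the behavior under test ...
});
```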

But your point about conformance testing is well-taken. Unlike the TCO tests, the pattern you've identified isn't related to any specific normative text. The circumstances which motivate the use of a "large" array are at least partly implementation-specific. That would make the decision to apply $LARGE_ARRAY_LENGTH a little ambiguous. For that reason, I'd also support simply modifying tests like coerced-values-end-detached-prototype.js to use minimal array sizes.

rwaldron commented 2 years ago

@jugglinmike I like this solution.

jugglinmike commented 2 years ago

@rwaldron Thanks! Do you mean introducing $LARGE_ARRAY_LENGTH or removing code intended to place stress on resource limitations?

rwaldron commented 2 years ago

Oh, I thought it was two parts of a whole, but I see now that I misinterpreted that. I think we should introduce $LARGE_ARRAY_LENGTH for cases where length limitations are actually being tested (I'm thinking of Array tests that exercise ToIndex(), maybe?), and we should adjust any existing tests where size is not actually relevant to use the smallest size that still effectively exercises the semantics under test.
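A sketch of the two categories (hypothetical examples, not existing test files):

```js
// 1. The length itself is under test: a boundary value is genuinely required.
//    ToIndex rejects lengths above 2^53 - 1 with a RangeError.
assert.throws(RangeError, function() {
  new Uint8Array(Number.MAX_SAFE_INTEGER + 1);
});

// 2. The length is incidental: only a few elements are needed to observe the
//    semantics, so the smallest workable size suffices.
var sample = new Uint8Array([0, 1, 2, 3]);
sample.copyWithin(0, 2); // same observable behavior at length 4 as at length 10,000
```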

phoddie commented 2 years ago

@jugglinmike - thank you for your detailed response. I like the idea of parameterizing the array length so the relevant tests can easily be tuned as needed. I'll work on generating a list of tests where this would help with XS.