dfinity / motoko

Simple high-level language for writing Internet Computer canisters
Apache License 2.0
506 stars 98 forks source link

[Error] Locally iterating through a large for loop throws wasm trap #3466

Open ByronBecker opened 2 years ago

ByronBecker commented 2 years ago

Issue: heap overflow due to a huge number of allocated bignums

for (i in Iter.range(0, 200000000000000)) {};

Locally running a very long for loop through wasmtime (not in a dfx deployed local replica) results in the following error.

Error: failed to run main module `Test.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: wasm `unreachable` instruction executed
       wasm backtrace:
           0:  0x3b6 - <unknown>!rts_trap
           1: 0xcef0 - <unknown>!motoko_rts::trap_with_prefix::hbb665872bf969d20
           2: 0xb34d - <unknown>!motoko_rts::rts_trap_with::h91063078d15271f7
           3: 0xb5d3 - <unknown>!motoko_rts::memory::ic::grow_memory::hb99ecdf4fcf0660e
           4: 0xd226 - <unknown>!mp_calloc
           5:  0xf16 - <unknown>!mp_init
           6: 0xa09e - <unknown>!bigint_of_int64
           7:  0x5d2 - <unknown>!B_gt
           8:  0x6e2 - <unknown>!next
           9:  0x469 - <unknown>!init
          10:  0x73b - <unknown>!_start

Motivation

I was comparing the results of a Motoko integer encoding library with the algorithm the I adopted from JavaScript, and lazily tried to do this by generating encodings for all integers from 0 to 200 ~billion~ trillion.

To Reproduce

Clone and run -> https://github.com/ByronBecker/rts_trap_unreachable_error

ByronBecker commented 2 years ago

It's expected that this will happen as some point, just wanted to document this somewhere.

It looks like the max iterations I can get up to in Motoko is right around 1_000_000_000 (1 ~million~ billion).

As a quick comparison, I can push this same code, but in NodeJS at the very least 100X (100 ~million~ billion iterations)

// JavaScript/TypeScript code, run from node
for (let i=0; i<100000000000; i++) {); // takes a minute or so, but runs without complaint.

although the original 200 billion iterations also fails in Node, I can pass the --max-old-space-size parameter if I wish for a process to consume more memory (i.e. 4096 for 4GB).

ggreif commented 2 years ago

1_000_000_000 is one billion, right?

ByronBecker commented 2 years ago

1_000_000_000 is one billion, right?

🤦 You are right, I am way off. Probably time for some sleep 😆

Yes 1 billion, the numbers are more accurate than my own perception of them 😆

ggreif commented 2 years ago

So the problem is that comparisons to such a high bound will always drop us into bignum-arithmetic land. Which implies heap allocations on all comparisons (what you get above is heap exhaustion), unless we teach moc to cache the bignum. Alternatively we could compare against a MAX_SMALL_BIGNUM first (this doesn't allocate and is super fast), and if that fails, check against the real (cached) bound.

This could be an IR transformation when we know that the lower bound of iteration is well < MAX_SMALL_BIGNUM and we are comparing with Int above that. Or maybe just do this in the codegen phase.

ByronBecker commented 2 years ago

Makes sense, this is a low priority issue from my side by the way - I've just been toying around with creating lexicographic encodings of large integers and then bulk testing the library.

It's just as easy for me to make smarter tests and put upper bound limits on my library.

crusso commented 2 years ago

It probably doesn't help that the wasmtime backend will never do gc currently, right @ggreif, @nomeata?

ggreif commented 2 years ago

@crusso right, but bignum comparison (and other ops involving big constants) shouldn't unnecessarily cons like there's no tomorrow!