WebAssembly / memory64

Memory with 64-bit indexes
Other
194 stars 21 forks source link

Adding support for 64-bit call_indirect or full table64 support #46

Closed sbc100 closed 2 months ago

sbc100 commented 4 months ago

When targeting wasm64 today llvm currently truncates the i64 pointer operand to call_indirect since call_indirect currently requires a i32.

Should we add support for an i64 operand to call_indirect as part of this proposal?

If so, should we do this by adding the concept of 64-bit-indexed table?

There have been some discussion of this already in both #43 and in #10 since I thought it worth creating a separate issue.

sbc100 commented 4 months ago

My understanding is that this would not but much work from the implementation side. @backes and @eqrion perhaps you could confirm?

(To be clear this would not involve implementations having to increase the size of the tables they support)

sunfishcode commented 4 months ago

I don't see a strong motivation for this. Looking at #43 and #10, the concrete motivations seem to be:

eqrion commented 4 months ago

My understanding is that this would not but much work from the implementation side. @backes and @eqrion perhaps you could confirm?

(To be clear this would not involve implementations having to increase the size of the tables they support)

It would be a 'fair' amount of work. We'd have to update our codegen for call_indirect bounds checking to perform the truncation/overflow check, including on 32-bit systems where the 64-bit pointer is two registers. Nothing we've not done before, but also not just a couple days of work.

rossberg commented 4 months ago

IIRC, the problem (reported by @matthias-blume at the time) was that implementing the right behaviour for invalid function pointers when "safely" compiling something like C required manual bounds checks in user code, since simply truncating from 64 to 32 bit before call_indirect might accidentally result in a successful call through an invalid pointer.

There also is the question of coherence of Wasm itself. We generally tried to keep memories and tables as similar as possible, especially after tables were repurposed to a general storage for references with the GC proposal, tables now being to reference types what memory is to numeric types.

backes commented 4 months ago

My understanding is that this would not but much work from the implementation side. @backes and @eqrion perhaps you could confirm?

(To be clear this would not involve implementations having to increase the size of the tables they support)

Implementing explicit bounds checking of i64 values against 32-bit values (e.g. for memory accesses on 32-bit platforms) didn't turn out to be particularly hard in V8. I guess we could reuse most of that for 64-bit table indexing. I think "a few days of work" is the right bucket for this.

eqrion commented 4 months ago

So just to make sure I understand what's being proposed, this would be a general '(table i64)' feature even if we maintain a <2^32 implementation limit on their size?

So we'd need to:

  1. Allow the limits of a tabletype to specify an i64 index type
  2. Extend text format parsing of tables to allow specifying index type
  3. Extend JS-API table constructors to take index type
  4. Extend JS-API get/set methods to take i64 values? (might be superseded by an implementation limit)
  5. Update the following instructions to take i32 or i64, depending on the type:
    • table.get/set/size
    • table.grow?
    • call_indirect

I think this could be a reasonable feature, and the example of correct compilation that Rossberg mentions seems compelling.

matthias-blume commented 4 months ago

Yes, that sounds correct to me.

FWIW, the way C function pointers are currently compiled to WASM does not do any of the bounds checking. This is not "unsafe" because bounds are still being checked after the 64->32 bit truncation has occurred. However, this is less than ideal because an out-of-bounds 64-bit function pointer could become valid during the truncation, so we currently lose some fidelity in error checking there.

Thanks for considering this! Matthias

On Wed, Feb 21, 2024 at 2:33 PM Ryan Hunt @.***> wrote:

So just to make sure I understand what's being proposed, this would be a general '(table i64)' feature even if we maintain a <2^32 implementation limit on their size?

So we'd need to:

  1. Allow the limits of a tabletype to specify an i64 index type
  2. Extend text format parsing of tables to allow specifying index type
  3. Extend JS-API table constructors to take index type
  4. Extend JS-API get/set methods to take i64 values? (might be superseded by an implementation limit)
  5. Update the following instructions to take i32 or i64, depending on the type:
    • table.get/set/size
    • table.grow?
    • call_indirect

I think this could be a reasonable feature, and the example of correct compilation that Rossberg mentions seems compelling.

— Reply to this email directly, view it on GitHub https://github.com/WebAssembly/memory64/issues/46#issuecomment-1957860383, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB56SSVVHQSAG22IUGBUMADYUZKYZAVCNFSM6AAAAABDLDR52KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNJXHA3DAMZYGM . You are receiving this because you were mentioned.Message ID: @.***>

sbc100 commented 3 months ago

I've put some time on tomorrow's CG meeting agenda to discuss this issue.

sbc100 commented 3 months ago

As per today's CG discussion lets have an informal poll here by voting thumbs up or thumbs down on OP of this issue.

sbc100 commented 2 months ago

Looks like we have 8 to 1 for adding this to memory64 proposal. @conrad-watt do you think we need to take this back to the CG or can we simply move ahead with this? I'm hoping we can simply start work on it immediately (perhaps we can do both.. i.e. start work and have a vote ASAP).

conrad-watt commented 2 months ago

Given the late stage of the proposal, this change should be confirmed with a pre-announced vote in the CG.

Given the straw poll here, I think it's ok for work to be started in anticipation of the vote result, but just be aware that if the vote goes a surprising way there may be the potential for wasted work!

rossberg commented 2 months ago

Given the late stage of the proposal, this change should be confirmed with a pre-announced vote in the CG.

Just want to note that this isn't a change, merely an addition, so champion's discretion could suffice? Technically, the only required votes are the stage votes.

conrad-watt commented 2 months ago

Technically, the only required votes are the stage votes.

There's no hard and fast rule on how much a proposal can be changed at phase 3, but assuming we're going with full table64 support we're essentially skipping a (small) feature-sized group of instructions past the regular phase process to phase 3. We should take extra care to make sure consensus on this decision is well-recorded, especially since the proposal has otherwise stayed unchanged for quite a while and there's no associated subgroup where regular discussions are happening.

It's not world-ending either way, but IMO it's better to take the extra meeting slot to be very deliberate about this and minimize surprise for the ecosystem.

conrad-watt commented 2 months ago

@sbc100 would you be ok with us slotting a brief confirmatory vote into tomorrow's meeting, or do you want this to settle for a little longer? (edit: see PR)

sbc100 commented 2 months ago

The poll on today's CG meeting was positive (all votes we in favor or neutral) so it looks like we will be going ahead with this.

sbc100 commented 2 months ago
  • Extend JS-API get/set methods to take i64 values? (might be superseded by an implementation limit)

This issue likely requires come consideration. In JavaScript i64 pointers tend to be represented as BigInt values and by extension it would make sense to allow these pointers to be used as arguments to the JavaScript .set and .get methods on tables.

However, in emscripten today it actually convert pointers to JavaScript Numbers (i.e. 53bit numbers) so for the emscripten POV it would be great if the .get and .set methods for 64-bit tables could either either Number or BigInt as arguments.

conrad-watt commented 2 months ago

What is currently done for 64 bit memories in the JS API? I vaguely remember we previously discussed a 2^53 JS implementation limit on memory size, to ensure that regular JS numbers could continue to be used as indices. If we followed through on this idea for memories, we could do the same for tables.

sbc100 commented 2 months ago

What is currently done for 64 bit memories in the JS API? I vaguely remember we previously discussed a 2^53 JS implementation limit on memory size, to ensure that regular JS numbers could continue to be used as indices. If we followed through on this idea for memories, we could do the same for tables.

Memory access is done via JavaScript typed arrays which can be indexed by both Number and BigInt. By analogy I suppose that .get and .set should also accept both.

sbc100 commented 2 months ago

Closing this issue for now. Implementation will be tracked in #51.