WebAssembly / component-model

Repository for design and specification of the Component Model
Other
897 stars 75 forks source link

Minor optimisation for store_string_to_utf8 #328

Closed GordonSmith closed 3 months ago

GordonSmith commented 3 months ago

Currently it is:

def store_string_to_utf8(cx, src, src_code_units, worst_case_size):
  assert(src_code_units <= MAX_STRING_BYTE_LENGTH)
  ptr = cx.opts.realloc(0, 0, 1, src_code_units)
  trap_if(ptr + src_code_units > len(cx.opts.memory))
  encoded = src.encode('utf-8')
  assert(src_code_units <= len(encoded))
  cx.opts.memory[ptr : ptr+src_code_units] = encoded[0 : src_code_units]
  if src_code_units < len(encoded):
    trap_if(worst_case_size > MAX_STRING_BYTE_LENGTH)
    ptr = cx.opts.realloc(ptr, src_code_units, 1, worst_case_size)
    trap_if(ptr + worst_case_size > len(cx.opts.memory))
    cx.opts.memory[ptr+src_code_units : ptr+len(encoded)] = encoded[src_code_units : ]
    if worst_case_size > len(encoded):
      ptr = cx.opts.realloc(ptr, worst_case_size, 1, len(encoded))
      trap_if(ptr + len(encoded) > len(cx.opts.memory))
  return (ptr, len(encoded))

But could be shortened to (no need for worst case length):

def store_string_to_utf8(cx, src, src_code_units):
  assert(src_code_units <= MAX_STRING_BYTE_LENGTH)
  ptr = cx.opts.realloc(0, 0, 1, src_code_units)
  trap_if(ptr + src_code_units > len(cx.opts.memory))
  encoded = src.encode('utf-8')
  assert(src_code_units <= len(encoded))
  cx.opts.memory[ptr : ptr+src_code_units] = encoded[0 : src_code_units]
  if src_code_units < len(encoded):
    trap_if(enc_len> MAX_STRING_BYTE_LENGTH)
    ptr = cx.opts.realloc(ptr, src_code_units, 1, enc_len)
    trap_if(ptr + worst_case_size > len(cx.opts.memory))
    cx.opts.memory[ptr+src_code_units : ptr+len(encoded)] = encoded[src_code_units : ]
  return (ptr, len(encoded))
GordonSmith commented 3 months ago

Note 1: There is also a bug in the removed code:

    if worst_case_size > len(encoded):
      ptr = cx.opts.realloc(ptr, worst_case_size, 1, len(encoded))
      trap_if(ptr + len(encoded) > len(cx.opts.memory))

When realloc is called with a smaller size to the current ptr, you can NOT assume that same address is returned (well in c++ guests anyway)...

Note 2: This has a nice upstream benefit as it results in a bunch of code removal?

lukewagner commented 3 months ago

The expected native implementation (that the Python is just describing the observable effects of) won't know len(encoded) (which I think is what you mean in your suggested code with enc_len) until the end of the copy. The goal of this whole routine is to avoid multiple passes over the string and thus we can't depend on len(encoded) in the way that your suggested code requires. With the spec as written, the implementation can simply blindly copy UTF-8 bytes until the initial allocation is full and only then resize to the worst case (which doesn't require knowing the exact final size).

The final shrinking realloc acknowledges the fact the realloc is allowed to change pointer location (hence the leading ptr = realloc(....)).

GordonSmith commented 3 months ago

Sorry for the noise - forgot the existing contents will get moved as part of the realloc, sorry for the noise.