FuelLabs / sway

🌴 Empowering everyone to build reliable and efficient smart contracts.
https://docs.fuel.network/docs/sway/
Apache License 2.0
62.64k stars 5.36k forks source link

Monomorphization implementation is not true monomorphization #862

Closed emilyaherbert closed 2 years ago

emilyaherbert commented 2 years ago

The implementation of monomorphization is currently not actually monomorphization but instead a close approximation. This is causing implementation difficulties.

Given this Sway code:

script;

struct DoubleIdentity<T, F> {
  first: T,
  second: F
}

fn double_identity<T, F>(x: T, y: F) -> DoubleIdentity<T, F> {
  DoubleIdentity {
    first: x,
    second: y
  }
}

fn main() -> bool {
  let double_a = double_identity(true, true);
  let double_b = double_identity(10u32, 43u64);

  let double_a2: DoubleIdentity<bool, bool> = double_identity(true, true);
  let double_b2: DoubleIdentity<u32, u64> = double_identity(10u32, 43u64);

  true
}

the resulting types of the variable declarations during codegen are:

Struct { name: "DoubleIdentity", fields: [OwnedTypedStructField { name: "first", type: 4262 }, OwnedTypedStructField { name: "second", type: 4263 }] }
Struct { name: "DoubleIdentity", fields: [OwnedTypedStructField { name: "first", type: 4283 }, OwnedTypedStructField { name: "second", type: 4284 }] }
Struct { name: "DoubleIdentity", fields: [OwnedTypedStructField { name: "first", type: 4294 }, OwnedTypedStructField { name: "second", type: 4295 }] }
Struct { name: "DoubleIdentity", fields: [OwnedTypedStructField { name: "first", type: 4319 }, OwnedTypedStructField { name: "second", type: 4320 }] }

meaning that monomorphization has produced (in approximate code):

script;

struct DoubleIdentity<bool, bool> {
  first: bool,
  second: bool
}

struct DoubleIdentity<bool, bool> {
  first: bool,
  second: bool
}

struct DoubleIdentity<u64, u64> {
  first: u64,
  second: u64
}

struct DoubleIdentity<u64, u64> {
  first: u64,
  second: u64
}

fn double_identity<T, F>(x: T, y: F) -> DoubleIdentity<T, F> {
  DoubleIdentity {
    first: x,
    second: y
  }
}

fn main() -> bool {
  let double_a = double_identity(true, true);
  let double_b = double_identity(10u32, 43u64);

  let double_a2: DoubleIdentity<bool, bool> = double_identity(true, true);
  let double_b2: DoubleIdentity<u32, u64> = double_identity(10u32, 43u64);

  true
}

instead of what "true" monomorphization produces:

script;

struct DoubleIdentity<bool, bool> {
  first: bool,
  second: bool
}

struct DoubleIdentity<u64, u64> {
  first: u64,
  second: u64
}

fn double_identity<T, F>(x: T, y: F) -> DoubleIdentity<T, F> {
  DoubleIdentity {
    first: x,
    second: y
  }
}

fn main() -> bool {
  let double_a = double_identity(true, true);
  let double_b = double_identity(10u32, 43u64);

  let double_a2: DoubleIdentity<bool, bool> = double_identity(true, true);
  let double_b2: DoubleIdentity<u32, u64> = double_identity(10u32, 43u64);

  true
}

This is because monomorphization happens each time a struct is created: https://github.com/FuelLabs/sway/blob/fb93b1b438141c282dabbb5c8abb544870c91962/sway-core/src/semantic_analysis/ast_node/expression/typed_expression.rs#L873

In contrast, Rust performs monomorphization by first going through a "collection phase" where all types used for generics are collected, and this information is used for monomorphization: https://rustc-dev-guide.rust-lang.org/backend/monomorph.html#collection

As it stands, there may be no practical difference between "true" monomorphization and the current implementation, aside from additional complexity resulting from duplicate definitions. However there is motivation for changing it going forward:

I propose that we consider restructuring monomorphization to be more akin to "true" monomorphization, with these issues as the motivation.

sezna commented 2 years ago

Just to provide some context around how methods for these monomorphized types were going to work in my original implementation (although remain unimplemented)...

Assuming type id 1 and 2 are both resolving to bool, then they are considered "the same". It is imperative that this holds. In the namespace, all methods are stored in the list of implemented traits. When we look for methods for a type, we check that the two types are equal. The reason this is failing for monomorphized structs is because the auto-derived PartialEq method for TypeInfo doesn't utilize the type engine, instead directly comparing the underlying usizes.

We need to make it such that Struct { a: TypeId(1), b: TypeId(1) } is equal to Struct { a: TypeId(2), b: TypeId(2) } if 1 can unify to 2.

tl;dr: the auto-derived PartialEq implementation doesn't use the type engine to determine equality

sezna commented 2 years ago

I'm not sure what exactly you mean by true -- in the type system, if the types that are referenced by two ids are equal, then they should be the same type. That's where this is falling apart, not the fact that we use type ids so liberally.

emilyaherbert commented 2 years ago

I am proposing that instead of making one monomorphized copy per enum/struct/function use, leading to potentially many copies of the same type definition, that instead we only make one copy per type.

sezna commented 2 years ago

I'm not sure I see the benefit -- that's a pretty hefty rewrite/restructuring, and the purpose of the original design is that it is cheap and easy to monomorphize all over the place and delegate the work of resolving the types to the type engine. If things are working correctly, there should be lots of differing type ids.

If the equality of types is defined in terms of its representation in the type engine, and not by raw type ids, would that accomplish the same goal, without having to do all of that rewriting?

emilyaherbert commented 2 years ago

This will be solved by #1821.

emilyaherbert commented 2 years ago

Closing in favor of #2636