jamesjuett / lobster

Interactive Program Visualization Tools
8 stars 3 forks source link

Virtual function calls don't work correctly on array element objects #289

Open jamesjuett opened 2 years ago

jamesjuett commented 2 years ago

Example:

int main() {
  Duck ducks[2];
  Bird *b = &ducks[0];
  b->talk(); // should print quack, actually prints tweet
}
iteemhe commented 2 years ago

I identified this issue. In runtimeEnvironment.Memory.dereference. This part will cause this error.

        // Grab object from memory
        var obj = this.objects[addr];

        if (obj && (similarType(obj.type, ptr.type.ptrTo) || subType(obj.type, ptr.type.ptrTo))) {
            return obj;
        }

        // If the object wasn't there or doesn't match the type we asked for (ignoring const)
        // then we need to create an anonymous object of the appropriate type instead
        return new InvalidObject(ptr.type.ptrTo, this, addr);

When trying to this.objects[addr] in an array, we intend to get an object in that array by address. However, since this.objects stores array as an object, so this.obejects[addr] actually returns an array or undefined instead of an object, because the memory address of array elements is not in this.objects.

Screen Shot 2022-02-21 at 13 44 46

Then, this if-statement will always be false. Either because types do not match, since we are comparing array type and a class, or undefined object, because this.objects only stors the start address of array.

        if (obj && (similarType(obj.type, ptr.type.ptrTo) || subType(obj.type, ptr.type.ptrTo))) {
            return obj;
        }

The, after the failure comparison, this function will return an InvalidOjbect using the type of pointer(Bird in our case), which will let FunctionEntity.getDynamicallyBoundFunction always returns the function of static type instead of the dynamic type.

iteemhe commented 2 years ago

For example, the start address of duck is 10004, and the first duck is at 10004 too. And the second duck is at 10008. If we want to access 10004 by address from this.objects. These two lines will fail for some reason.

        if (ptr.type.isArrayPointerType()) {
            return ptr.type.arrayObject.getArrayElemSubobjectByAddress(addr);
        }
        if (ptr.type.isObjectPointerType() && ptr.type.isValueValid(addr)) {
            return ptr.type.pointedObject;
        }

Then, if we want to access 10008 in this.objects, this will return undefined, because this address does not exist in this.objects, instead, it exists in that array.

Screen Shot 2022-02-21 at 14 41 48

I haven't figured out how to fix this issue, because the complex inheritance relationship and interface. I think there is a way to solve the fix ducks[0]. However, it is hard to solve ducks[1], because we need to traverse memory then find the correct object on that address.

jamesjuett commented 2 years ago

When a pointer to an array object is obtained, there's a special runtime type represented by ArrayPointerType that stores information about the array the pointer originally came from, which is how Lobster does things like runtime bounds checking on pointers. When an array element is accessed, there's eventually a call to getPointerTo() on the ArraySubobject that yields this pointer.

image

There's also a special ObjectPointerType that you get when taking the address of most anything else and that tracks the runtime object that it points to. This turns out to be the way that Lobster enables polymorphism, since the dynamic type is simply retrieved from the runtime type of the actual real object you're pointing to. The source of the problem in this issue is actually in the upcast to Bird *b.

Currently, this upcast is implemented as a PointerConversion, which is a kind of NoOpTypeConversion: image image

In the above, the conversion creates a new Value wrapper around the same raw value, but with the destination type of the conversion, this.type, which is Bird * in this example. However, in so doing it discards the runtime type information (RTTI) that was present via the special ArrayPointerType.

The conversion here needs to be more complex and account for RTTI. It wouldn't be correct to simply keep the RTTI rather than discarding it, because if you have this situation:

int main() {
  Duck ducks[2];
  Duck *d = &ducks[0]; // or = ducks, or = ducks + 1, or whatever
  Bird *b = &ducks[0];
}

d is rightly an ArrayPointerType and can be used for things like pointer arithmetic within the bounds of the array.

But b should not be an ArrayPointerType, because it has lost the ability to safely navigate the array via pointer arithmetic, since Bird and Duck may not be the same size.

You could also look at an example without an upcast:

int main() {
  Duck ducks[2];
  (&ducks[0])->talk();
}

And since the &ducks[0] portion yields an ArrayPointerType rather than an ObjectPointerType, it can't behave polymorphically. But, I don't think it ever needs to, since its static and dynamic type are the same (unless it goes through a cast, as before).

I think these general rules make sense: