Inspecting errors thrown from ShadowRealm

legendecas commented 2 years ago

The errors thrown from ShadowRealm, withShadowRealm.prototype.evaluate, WrappedFunction.[[Call]], and ShadowRealm.prototype.importValue, are all wrapped with TypeError created in the caller realm. Exposing inspection of these errors to the executing realm can be very helpful in the real world. However, the spec didn't have a word if the creation of the TypeError can cause side effects in the executing realm.

Like say:

const fn = new ShadowRealm().evaluate(`
function createErrorObject() {
  return {
    get message() {
      globalThis.called = true;
      return String('foobar');
    }
  }
}

function foo() {
  throw createErrorObject();
}

foo;
`);

try {
  fn();
} catch (e) {
  e.message; // <= 
}

In this case, can the message getter be called? should the return value of the getter be wrapped? Or the implementation should only provide a non-observable inspection -- only exposing the inner name and message when the name and message of the error objects are data properties of primitive values?

Edit: although the title says "the error thrown from ShadowRealm", this problem also applies to the WrappedFunctions created for passing arguments into the ShadowRealm.

ljharb commented 2 years ago

I would assume the thrown exception can't contain any object references from inside the shadow realm, which means it either can't have a getter, or, the object would be cloned such that the getter was wrapped. If the getter was wrapped, then it would set the shadow realm's globalThis.called, not the outer realm, which is fine.

Jack-Works commented 2 years ago

I would assume the thrown exception can't contain any object references from inside the shadow realm, which means it either can't have a getter, or, the object would be cloned such that the getter was wrapped. If the getter was wrapped, then it would set the shadow realm's globalThis.called, not the outer realm, which is fine.

It can access the getter once and copy the name & message (but should we?).

I don't think we should wrap the getter.

ljharb commented 2 years ago

it would make sense to me that all getters would be reified as own properties prior to crossing the realm boundary.

Since objects aren't permitted to cross the boundary, i don't think it's a reasonable expectation that obscure things about a thrown object will survive to the outer realm.

legendecas commented 2 years ago

It sounds good to me to reify name and message as own properties on crossing the realm boundary. However, the problem is when the reification is performed regardless of if accessing thrown objects' name or message may cause side effects, the following case may be surprisingly a dead loop -- even the code below didn't access the TypeError created in the caller realm:

const fn = new ShadowRealm().evaluate(`
function createErrorObject() {
  return {
    get message() {
      throw createErrorObject();
    }
  }
}

function foo() {
  throw createErrorObject();
}

foo;
`);

try {
  fn();
} catch {}

leobalter commented 2 years ago

This is interesting.

where (in specs) do we connect the getter of message to this new TypeError?

I'm not sure if I want to have a wrapped functions for it, but if triggering a get is too much, I'd rather not have any reference to a cross-realm message unless it is a data property.

mhofman commented 2 years ago

What happens if the error object is a proxy that claims it has an own message data property ?

mhofman commented 2 years ago

I think there are 3 options:

We always read the message to construct the error
We never read the message to construct the error
We create a new error object with a get message that basically wraps () => originalError.message

ljharb commented 2 years ago

why get message on a new error instead of just making a data property? message isn't a getter on built-in errors.

mhofman commented 2 years ago

To build the data property, you have to eagerly read the value on the original error (option 1), unless you make this new error object exotic.

ljharb commented 2 years ago

ahhh right, then yes that's what i'd suggest.

mhofman commented 2 years ago

To clarify, what do you suggest: eager read or using an exotic error object ?

ljharb commented 2 years ago

Eager read - like getters and setters, exotics should be eliminated as much as is possible :-D

mhofman commented 2 years ago

So I suppose the question is, what should the behavior be if accessing message on the original error throws (or if the original thrown thing isn't an error at all)

ljharb commented 2 years ago

i'd assume whatever it throws would then be cloned in the same way as the original error was attempted to be, and that is what makes it across. Certainly if the original thing doesn't have an [[ErrorData]] slot i wouldn't expect .message to be accessed at all.

Jack-Works commented 2 years ago

I think if reading the message throws, or the result is not a string, we ignore it.

mhofman commented 2 years ago

We discussed this in the SES meeting today, and given that ShadowRealm is an advanced API, I don't think it makes sense to try and provide some improvements to the handling of Error instances thrown across the callable boundary. Aka, if a wrapped function throws an error, the caller should not get any more information than a generic TypeError, with no name or stack information about where the original error was thrown.

However it was raised that we may want to allow values that would otherwise be allowed through the callable boundary to be thrown, aka pass thrown primitives as-is, and wrap thrown functions. This may simplify the handling of objects, including error instances, for membranes working across the callable boundary.

legendecas commented 2 years ago

@mhofman the problem about leaving the wrapping to userland is that values thrown from the shadow realm are not always from user codes.

Like, ShadowRealm.prototype.importValue can be rejected by the host with an error about target module not found, etc. In this case, there is no chance for userland code to wrap the thrown values. Without a mechanism to expose the info about the original error, it would be hard for people to figure out why ShadowRealm.prototype.importValue fails.

mhofman commented 2 years ago

Afaik, both .evaluate() and .importValue() can already throw fully detailed errors for syntax or other evaluation exceptions in the caller realm without going through the attenuation of the callable boundary, or am I missing the point of your example?

mhofman commented 2 years ago

Aka either the error is thrown by the ShadowRealm API itself, in which case it can include details about the error since the error is created in the caller realm, or the wrapped function is invoked, and the target code can make sure any error thrown during its execution is handled appropriately.

legendecas commented 2 years ago

Specifically, I mean the HostImportModuleDynamically can reject the promise with host defined errors, and step 13 of https://tc39.es/proposal-shadowrealm/#sec-shadowrealmimportvalue replaces the rejection with TypeError and discard the host defined errors. In HostImportModuleDynamically, user code may not get a chance to be evaluated.

mhofman commented 2 years ago

Ok so that case doesn't deal with the callable boundary, right? I'm not sure I grok the different places that can throw an error in this case, but maybe there is a way to let errors resulting from the parsing and loading of the module through without neutering them, while making sure runtime errors while evaluating the module end up as a caller realm TypeError, similarly to how PerformShadowRealmEval works?

legendecas commented 2 years ago

Ok so that case doesn't deal with the callable boundary, right?

I'm not sure I get your point clearly. The host errors are created in the shadow realm (the executing context on HostImportModuleDynamically is the one shadow realm created), so the error values still need to go through the boundary.

I'm not sure I grok the different places that can throw an error in this case, but maybe there is a way to let errors resulting from the parsing and loading of the module through without neutering them, while making sure runtime errors while evaluating the module end up as a caller realm TypeError, similarly to how PerformShadowRealmEval works

Right. However, if I recall correctly, Hosts like Node.js can expose module loader hooks to userland so user code can still get a chance to evaluate to "load the module". In this case, the problem can still arise that inspecting the error may cause side effects.

mhofman commented 2 years ago

By callable boundary in this case I meant the wrapping that is done by the wrapped function exotics.

Here it seems the issues you raise stem from the "entrypoints" into the ShadowRealm, aka HostImportModuleDynamically and possibly PerformShadowRealmEval, where something fails before obtaining a result value from the evaluation or import.

As I said, I don't fully grasp the layering for these operations, but as long as the following invariants are preserved, I think it's acceptable for errors to contain useful information:

User code (through host hooks or any other means) running in a given realm can never get direct access to "objects from another realm".
user visible stack traces from one realm do not seamlessly include stacks from another realm

(The above only applies where at least one ShadowRealm is involved, of course there is nothing preventing 2 legacy realms from leaking to each other)

syg commented 2 years ago

@mhofman I don't understand SES's recommendation here to not do anything about the errors. If HostImportModuleDynamically fails, it'll throw an error constructed in the shadow realm. That error contains the actionable message (e.g. "file not found") that we need to surface to the outside user of a shadowRealm.importValue() call. It's unacceptable to not surface that info IMO, otherwise how are you supposed to debug errors during importing?

At the very least I'd expect reading the .message property and copying it in the re-thrown error.

Edit: This opaque errors thing has bitten me already! In @legendecas's implementation in V8, we forgot to upload the imported module files in some importValue tests to test runner bots, and those bots were failing with a generic "cannot import value" error instead of a more descriptive "cannot find file" error. That made debugging much harder than it should've been.

mhofman commented 2 years ago

Ok I probably didn't express myself correctly.

Conceptually I have no problems with errors thrown in the ShadowRealm context with no user code on the stack to contain debugging information visible from the caller realm. That includes the host's module resolution mechanism, parsing or any early syntax errors.

My concern is errors thrown when user code running in the context of the ShadowRealm is on the stack. That includes errors thrown while executing a module graph's top level code, including top level await.

The first can only ever happen during the HostImportModuleDynamically and PerformShadowRealmEval operations, not the wrapped function exotic's [[Call]]. I don't know how the spec text can be changed to accommodate this, if at all possible, but I am entirely open to improve the user debugging experience for this specific case.

syg commented 2 years ago

Hm, I don't have time to dig in right now on how to give an allowance in spec text for actionable error messages if they're not user-originated, but that's a pretty important thing to allow. I don't think there is a way to distinguish errors that are user-originated vs not. Perhaps we'd need to manually examine throw completions from HostImportModuleDynamically and PerformShadowRealmEval to re-wrap messages with the actionable intact.

syg commented 2 years ago

I've put this issue on the agenda for the June meeting to get clarity on what should or should not be exposed on errors across ShadowRealm boundaries. It's uncontroversial that errors should not pierce the callable boundary via direct object references, but not at all clear to me:

Whether stacks should also be prohibited (it's not a specced thing, but good to get discussion on in that it is a reality)
Whether messages can be passed through
Whether we should, and how to, distinguish user-thrown and platform-thrown errors

mhofman commented 2 years ago

Thanks @syg. We plan to also discuss this in next week's SES meeting, if you're able to join.

annevk commented 2 years ago

So HTML does define serialization and deserialization of objects that have [[ErrorData]]. This should not end up contradicting that, right? How will that be ensured? (See also #336 which dealt with what seems like a very similar issue. My concerns are similar to those raised there, but perhaps I'm missing something.)

mhofman commented 2 years ago

@annevk I'm failing to understand what you mean. Objects with an [[ErrorData]] slot do not have a special treatment through the callable boundary, and are currently just objects which will cause a new TypeError.

In what case does HTML serialization come into play?

FYI, I am looking at an explicit structured cloning like API as a followup proposal to handle sharing objects through the callable boundary, but that's out of scope for the current proposal.

annevk commented 2 years ago

@mhofman I was confused by an earlier comment that's apparently no longer applicable. Feel free to hide my comment as resolved.

Jack-Works commented 2 years ago

What about Error Cause proposal?

littledan commented 2 years ago

I don't think that the way that ShadowRealm is an advanced API blocks us from creating some small conveniences for debugging issues in practice. Accordingly, I would suggest that, if an error is thrown which is not primitive or callable (so it is wrapped), then a new TypeError is created which eagerly reads the message property and wraps it as any other value across the ShadowRealm boundary. On the other hand, I would expect the browser-specific stack inspection APIs to not work--these would need more wrapping at the boundary to avoid a leak, significantly complicating them. Let's keep this simple and practical.

mhofman commented 2 years ago

an error is thrown which is not primitive or callable (so it is wrapped)

and wraps it as any other value across the ShadowRealm boundary.

but ShadowRealm does not wrap non-callable objects, it throws at the boundary. That's the whole point: why make thrown things special?

then a new TypeError is created which eagerly reads the message property

This would have to be specified in the ShadowRealm API proposal as it is observable behavior. Should it be any thrown object that we attempt to read the message property, or should we brand check Errors?

caridy commented 2 years ago

Update: We discussed this during plenary today, thanks @syg for the presentation and the context. The consensus there was to split this into two:

A) ergonomics of importValue module resolution errors. B) ergonomics of other error

Where A) being a blocker, and B) just a potential follow up proposal with some intersection semantics with other proposals focused on improving errors in general.

As for A), I believe we have few options:

Option 1

We can try to resolve this with small changes in ShadowRealmImportValue. It is relying on NewPromiseCapability, providing a custom onFulfilled argument, but a generic callerRealm.[[intrinsics]].[[%ThrowTypeError%]]. Instead, we can provide a custom onRejected argument that can give us more freedom to accommodate the actual error. The challenge here is to understand what the nature of the error is, and how to construct a similar error on the other side.

Btw, I don't think the spec is doing anything special right now, other than just passing the [[%ThrowTypeError%]], so the actual error is completely shadowed, not even inspected.

Option 2

to have a more specific operation based on HostImportModuleDynamically that can have a well defined Failure path that knowns how to surface the right information in the incubator realm.

I believe option 2 is what we had before but at some point during the refactors, that was removed all together for a more simplified solution. It is also interesting to think about option 2 as a way to go here because it opens the door to implemented a wrapped namespace object in the future more easily.

littledan commented 2 years ago

In the June 2022 TC39 meeting, we discussed this topic and drew the conclusion that there should be some improvement in the introspection of errors thrown from the other side of the shadow realm boundary. To make this more concrete, I'd suggest that, we go with a more uniform approach than the one in #370 . I'd suggest that, when errors pass over the ShadowRealm boundary and are rethrown as a new TypeError (step 11.a), that we add the following additional logic:

Get the message property of the inner object, assert it is a string, or leave it missing if it's not a string, and assign that to the new TypeError's message property
Prohibit implementations from exposing their general stack introspection APIs for errors from the other side of the ShadowRealm boundary, as this would either break isolation or not work
Leave it for implementations to decide whether they want to support string-based stack introspection APIs (If we want to set a uniform expectation, I'd be fine with the polarity being either way)

We should also use the same algorithm at the end of importValue, in the catch clause of step 13, to cover the case of errors from importing a module, as we all discussed in committee.

mhofman commented 2 years ago

Couple quick questions regarding that approach:

Should this copy be done for any object, or should there be a brand check for [[ErrorData]]?
Should this copy be done only for data properties, or should it invoke a getter if present?

If there is a brand check, and only a the message data property is allowed, the copy would not be observable by the program.

ljharb commented 2 years ago

imo it should definitely do a brand check, and it sounds fine to only do it for data properties.

littledan commented 2 years ago

I was proposing that there be no brand check, and that it would invoke a getter if present. The algorithm would just be a normal Get. What is the motivation for doing otherwise, or trying to limit observability?

mhofman commented 2 years ago

I was just asking for a clarification, I do not currently hold a strong opinion one way or another (well I'd probably be against an observable Get on the original error deferred to the time of Get on the new error).

caridy commented 2 years ago

Keep it mind something that was discussed in plenary, which is that there is no evidence, or strong desire to hide error information from the incubator realm, in fact it is desirable that the incubator can get a lot more info when possible. If course, this is a two ways street, and sometimes it is the shadow realm the one calling into the incubator realm, in which case the error should be completely hidden.

legendecas commented 2 years ago

@littledan: I was proposing that there be no brand check, and that it would invoke a getter if present. The algorithm would just be a normal Get. What is the motivation for doing otherwise, or trying to limit observability?

I'd be happy to see a uniform approach to transferring exceptions across the boundary. But I'm curious about how to handle the possible failure on the normal Get -- like in the example raised previously https://github.com/tc39/proposal-shadowrealm/issues/353#issuecomment-1071101645.

littledan commented 2 years ago

@mhofman Yes, I was picturing that the Get would be eager, when crossing the boundary, not lazy through a getter, for exactly the reason you state, as well as for general simplicity. I think this addresses the issue @caridy mentions, right?
@legendecas Yes, it does seem like that loop case is important. I would say that, if the Get throws, we simply leave the .message property undefined (just as I am proposing would happen if it is something other than a string).

ljharb commented 2 years ago

Leaving it undefined would conflate it with throwing an error with an actual undefined message property, no?

acutmore commented 2 years ago

With debugging being a core motivator, would one option be to keep the originally thrown error referenced in a hostInternalSlot of the new TypeError. Which hosts are permitted to read in a non-observable (to the application) way. e.g if the error was passed to the platforms built-in console.error it could log a more detailed message to stderr.

I can imagine a similar thing happening for errors thrown in their own task. e.g a setTimeout(() => { throw new Error(msg) }) happening inside the shadowRealm wouldn’t be passed back to the incubatorRealm, but the host could still log the uncaught exception.

mhofman commented 2 years ago

During the SES call yesterday I suggested constructing a new shadow realm error that captures the message and name of the original error, and is used as a cause for a typeerror thrown across the callable boundary. I'm not entirely sure we need the indirection, but it's an option. Anyway, if we have a custom error type, we could store a link to the original error, but being non observable, does the link actually need to be specified or should that be left to implementors choice

acutmore commented 2 years ago

we could store a link to the original error, but being non observable, does the link actually need to be specified or should that be left to implementors choice

Maybe the normative spec specifies what, if any, data can observably be cloned to the new error, and an editorial note suggesting additional, non-observable from the application, developer-experience improvements?

littledan commented 2 years ago

What engines have asked us for is guidance on what they should actually do, rather than latitude to do a wide range of things as such a note would leave them. (Note that a "MAY" is normative, not a note/editorial, and that might be what we want to do here.) Let's see if we can answer the implementers' question concretely.

littledan commented 2 years ago

We discussed this issue at length in the SES call this week. After some thought, I believe we reached the shared conclusion that it should be OK to copy a string .message to the new error, and that this level of transparency makes sense to apply "reflexively", that is, when crossing the ShadowRealm boundary in either direction.

We did not settle on whether to do an arbitrary Get of the "message" property, or whether to avoid calling arbitrary JS code by only getting the own "message" property of an original error object (as marked by an internal slot that Errors have, which proves that it's not a Proxy). Although I argued for the former above (out of simplicity), I'd be OK with the latter as well.

caridy commented 2 years ago

I believe we also agreed during the call that the type of error is not relevant or important, and that we will continue throwing the TypeError with the proper data attached to it.

tc39 / proposal-shadowrealm

Inspecting errors thrown from ShadowRealm #353

Option 1

Option 2