Closed kriskowal closed 1 year ago
The error messages are part of the code and consequently use flash space. Projects do run up against flash size limits. Making the messages bigger would make that more likely.
I suppose an optimal solution would allow messages to be even smaller than today while providing the option to be more verbose. That perfect world would benefit constrained systems while improving the debug developer experience.
As I recall, a related topic came up some time ago with @erights regarding the use of error messages, which are non-normative, to infer the engine currently executing a script. Mark's thinking at the time, if I recall well, was to replace error strings with error numbers. Those have more potential to be consistent across engines (much hand-waving here) and could be resolved to human readable strings (more hand-waving).
In the spirit of hand waving, perhaps we could have a table mapping error numbers to messages, with each message represented as a rope of shared substrings compiled from that table.
Yes, something like that. I would kind of hope that the error messages don't get so big that using ropes is necessary (but I understand where such a mechanism would be valuable for some of your scenarios).
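In the same hand-waving spirit, here is a minimal sketch of the table-plus-rope idea. Everything in it is hypothetical: the error numbers, the fragment list, and the name messageFor are made up for illustration and do not correspond to anything in XS. Each fragment is stored once, and a message is just a list of fragment indices.

```javascript
// Hypothetical sketch: none of these names or numbers exist in XS.
// Shared substrings are stored once.
const FRAGMENTS = ["no ", "invalid ", "argument", "this", "BigInt", " is not a "];

// Error number -> rope (a list of indices into FRAGMENTS).
const ROPES = new Map([
  [1, [0, 2]],    // "no argument"
  [2, [1, 2]],    // "invalid argument"
  [3, [3, 5, 4]], // "this is not a BigInt"
]);

// Resolve an error number to its human-readable string.
function messageFor(errorNumber) {
  const rope = ROPES.get(errorNumber);
  if (rope === undefined) {
    return `#${errorNumber}`; // unknown number: fall back to the bare code
  }
  return rope.map((index) => FRAGMENTS[index]).join("");
}

console.log(messageFor(1));
console.log(messageFor(3));
```

A constrained build could presumably ship only the numbers and leave the fragment and rope tables on the debugging host.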
@phoddie you remember correctly.
With error messages being prose text, we're never going to get to a deterministic JS spec. Instead, for each place in the ecmascript spec where it mandates that the platform throws an error, we need to decide on a non-prose error message that we'd be willing to advocate for a deterministic JS spec. The programming environment --- everything from REPL to console to debugger to IDE --- can then have the tables for rendering these into legible prose. Some of these simple messages will merely be unique codes like '#37'. Some of these should also contain data, so that their human rendering can use that data together with the code to render legible prose. Something like '#42: "constructor"' might render as 'Cannot assign to "constructor" due to the override mistake.' Rather than numeric codes, I suspect it would be better to standardize on short identifiers.
I imagine internationalization tables have some conventions for turning codes+parameters into localized sentences. If there are existing conventions for doing that, we should consider them. Otherwise, I think we can painlessly roll our own.
Ideally, the prose tables and their use for rendering would be sufficiently outside the deterministic JS computation that they would not appear in snapshots, and that the same shared deterministic computation can be localized differently for different observers.
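A rough sketch of how a console or debugger might turn a code plus data into prose, using the '#42: "constructor"' example from this comment. The wire format (tag, then ": ", then comma-separated JSON values), the {0} placeholder syntax, and the names parseTagged, render, and EN_PROSE are all assumptions made for illustration, not an existing convention. The table lives outside the deterministic computation, so a different observer could pass a different table.

```javascript
// Hypothetical prose table for one observer; other observers could supply their own.
const EN_PROSE = new Map([
  ["#42", "Cannot assign to {0} due to the override mistake."],
]);

// Split an in-band message such as '#42: "constructor"' into a tag and its data.
function parseTagged(wire) {
  const separator = wire.indexOf(": ");
  if (separator === -1) {
    return { tag: wire, params: [] }; // a bare code such as '#37'
  }
  return {
    tag: wire.slice(0, separator),
    params: JSON.parse(`[${wire.slice(separator + 2)}]`),
  };
}

// Render a tagged message into prose using the given table.
function render(wire, table) {
  const { tag, params } = parseTagged(wire);
  const template = table.get(tag);
  if (template === undefined) {
    return wire; // no prose known for this tag: show it unchanged
  }
  return template.replace(/\{(\d+)\}/g, (_, index) =>
    JSON.stringify(params[Number(index)])
  );
}

console.log(render('#42: "constructor"', EN_PROSE));
```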
Just to be clear, my understanding from the above discussion is that you would not accept PRs that tripled the length of error messages in order to provide a better clue to developers, unless those PRs came with engineering to minimize the impact on the resulting flash text increase. A PR to change "no" to "not", on the other hand, would be less controversial. Do I read the room right?
Basically, yes. Because....
To that point, I understand Mark's goal of determinism and how error messages are a problem. I'm not confident that error messages can be normalized across engines. If that's true, maybe we should consider a different approach. The error messages are clearly not normative. As such, scripts should not be making decisions based on the messages. If that's strictly true (almost surely not, but...), then an engine configured to run in deterministic mode could suppress error messages entirely. Such an approach would be terrible for debugging, but perhaps there's an out-of-band solution there (for example, an internal slot accessible to debugging tools but not the script).
FWIW - I understand the use of "no" versus "not" is distracting, but an additional letter doesn't change much for developers. If we have a way to map in longer error messages, it would allow significantly more verbose messages, which is ultimately what would make a significant improvement.
It is an interesting point that the messages are non-normative, therefore code should not depend on them, therefore a valid program should be equally valid if all error messages are empty strings. With the SES-shim, we hide the stack from the program and reveal it to the console if it makes it that far. We also allow errors to be annotated, only allowing the console to reveal the annotations. We intend to treat errors as opaque objects for the purpose of currently hypothetical distributed debuggers, using out-of-band aggregation of the causal graph, stacks, and annotations. It would be equally valid to hide the message at a minor loss to ad hoc debugging.
This is a compelling long-term vision.
I'm not confident that error messages can be normalized across engines.
I don't expect the mainstream engines (V8, SpiderMonkey, JSC) to ever implement Deterministic JS. Initially, I am only hoping that together we can write down a Deterministic JS spec that Moddable is willing to implement in a future XS. The purpose of that spec is to write down what a third party would need to implement so that their execution is a lockstep deterministic replay of execution on any other engine conforming to the Deterministic JS spec. Any virtual machine to be run on a public permissionless blockchain should have an equivalently deterministic spec. EVM and ewasm do.
The relationship between Deterministic JS and standard EcmaScript should be that the former is a refinement of the latter. This means that any conforming implementation of Deterministic JS is also a conforming implementation of standard EcmaScript.
Another example: Deterministic JS must specify the sorting algorithm used by Array.prototype.sort, since it is observable. I expect we'll specify whatever XS currently does, if XS does something reasonable to specify. I don't imagine we'll ever get V8, SpiderMonkey, or JSC to agree to sort with that algorithm.
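To illustrate why the algorithm is observable: a script can pass a comparator that records every pair it is asked about, and that record depends on which algorithm is running. The sketch below uses insertion sort purely as a short stand-in for "some pinned algorithm"; it is an assumption for the example, not a claim about what XS does. The helper names comparisonTrace and insertionSort are likewise made up.

```javascript
// Record every pair the comparator is asked about; this trace is visible to the script.
function comparisonTrace(sortFunction, input) {
  const trace = [];
  sortFunction([...input], (a, b) => {
    trace.push([a, b]);
    return a - b;
  });
  return trace;
}

// Stand-in for an algorithm a deterministic spec might pin down
// (insertion sort, chosen only because it is short).
function insertionSort(array, compare) {
  for (let i = 1; i < array.length; i++) {
    const item = array[i];
    let j = i - 1;
    while (j >= 0 && compare(array[j], item) > 0) {
      array[j + 1] = array[j];
      j--;
    }
    array[j + 1] = item;
  }
  return array;
}

// The engine's own sort, wrapped to share the same call shape.
const builtinSort = (array, compare) => array.sort(compare);

const pinnedTrace = comparisonTrace(insertionSort, [3, 1, 2]);
const engineTrace = comparisonTrace(builtinSort, [3, 1, 2]);

// The pinned trace is the same everywhere; the engine trace may differ between engines.
console.log(JSON.stringify(pinnedTrace));
console.log(JSON.stringify(engineTrace));
```

Both calls sort the array correctly; only the sequence of comparator calls can differ, and that sequence is exactly what a lockstep replay would need to reproduce.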
An open question is which standards org, if any, we take the Deterministic JS spec to. It may be tc39, tc53, or the new TC in formation for blockchain interoperability standards. We don't need to figure that out until we have a Deterministic JS spec, which will probably be many years away.
If that's strictly true (almost surely not, but...), then an engine configured to run in deterministic mode could suppress error messages entirely. Such an approach would be terrible for debugging, but perhaps there's an out-of-band solution there (for example, an internal slot accessible to debugging tools but not the script).
That's exactly what E did. It was wonderful. The error/assert/console library I added to the SES shim takes a strong step in that direction, but without omitting the in-band error messages entirely. I agree this is something to consider for Deterministic XS. The error/assert/console approach to the out-of-band info seems good.
The same consideration applies to error stacks. The SES shim removes the stacks for in-band access from the error object itself, but provides them out-of-band where our console can get them.
This topic seems to be in a good place and have consensus on long term direction and goals.
From an XS perspective, I think possible near term steps are around error description mapping. Specifically, mapping the current short descriptions to long descriptions, to be friendly to developers by providing more information about the problem, and mapping the current short descriptions to empty strings, to be friendly to developers by providing more space for their code. ;) The SES/Deterministic JavaScript approach likely wants both mappings -- providing empty descriptions to scripts for determinism while providing long (or short) descriptions out-of-band for debugging.
I agree that empty strings are best for Deterministic JS for in-band error messages. Likewise empty stacks, which the SES shim currently builds for itself starting with the API you currently implement (Error.prototype.string accessor that we delete, after grabbing its setter). For out-of-band info, the more the better. But a unique tag we can look up is fine rather than building more prose into the engine. The more significant issue is the data parameterizing that tag, such as the property name 'constructor' in the previous example. Our console would then use all these out-of-band channels to render errors with descriptive messages and stack that are useful for debugging.
For out-of-band info, the more the better. But a unique tag we can look up is fine rather than building more prose into the engine.
Understood. I think the hard part is coming up with the mappings, including substitutions. We do something like that in our Piu UI framework for localization, but that approach probably isn't right for the engine itself. Once that is solved, we can sort out where (engine, debugger, etc) to apply the mapping.
How direct a correspondence is there to the places in the XS implementation that decide to throw (and what to throw) vs the Ecma262 spec having a step that says that an error of a particular type must be thrown?
I think I know where you are heading with this... I may have had a similar thought. I'm confident that the XS implementation is closer to the spec on that than most engines, but it would be some work to accurately characterize that.
@erights – Part of what is ugly here is maintaining a mapping from the unique tag to the full message. Plus, it has long seemed impractical to get engines to agree on error messages. Maybe we can look at it differently? The primary goal is to eliminate the error messages as a source of entropy and as a way to distinguish engines. What if the error instance thrown has a message of the empty string and an internal slot with the real error message? XS can provide the host with a separate function to extract the actual message from the instance, and the Agoric runtime can hide that function from scripts it executes in Compartments. (Perhaps XS can limit hiding of the real error message to code executing in Compartments?)
This approach works with @kriskowal's excellent goal to provide more descriptive error messages. When the host extracts the real error message, it can apply a transformation through any convenient means, to provide additional detail.
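A minimal sketch of that shape, written in plain JavaScript with a WeakMap standing in for the engine-internal slot. The names realMessages, throwOpaque, and getRealMessage are hypothetical; a real implementation would live inside XS, and the extraction function would be whatever API XS chose to expose to the host.

```javascript
// WeakMap standing in for an engine-internal slot; scripts never get a reference to it.
const realMessages = new WeakMap();

// Hypothetical engine-side helper: throw an error whose in-band message is empty.
function throwOpaque(ErrorType, realMessage) {
  const error = new ErrorType("");
  realMessages.set(error, realMessage);
  throw error;
}

// Hypothetical host-only function; the host would withhold it from Compartments.
function getRealMessage(error) {
  return realMessages.get(error);
}

let caught;
try {
  throwOpaque(TypeError, "no argument");
} catch (error) {
  caught = error;
}

// What a script sees in-band versus what the host can recover out-of-band.
console.log(JSON.stringify(caught.message));
console.log(getRealMessage(caught));
```

The host is then free to run the recovered string through any table or transformation it likes before showing it to a developer.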
This is also consistent with our notion that an error object under Hardened JavaScript should be opaque to intermediate call frames. The most secure position is somewhat developer hostile in environments that don’t automatically unbox errors for the developer, so our position is just shy of dogmatic in ses. We have gone to lengths to ensure that error details get automatically revealed to console through SES, but we have not done this for message.
but we have not done this for message.
What @kriskowal says here is true for errors thrown from the engine. But for user defined errors thrown using our assert library, we actually put a lot of work into error.message. See https://github.com/endojs/endo/blob/master/packages/ses/src/error/README.md if you're curious. It is long and not really needed for this thread. But it's a fine read!
@erights – partial redaction of error.message is quite something! (I did read the link on error logging in Endo. That was so interesting it took me down a (magic) wormhole into the extensive "survey of logging frameworks".)
@kriskowal – that all makes sense. I can't imagine that your SES implementation wraps every built-in that might throw to make the error message opaque. Or do you??
This topic has been pending for some time (longer than this issue). I think we are closing in on a workable solution. Are we close enough that it makes sense to explore an implementation in XS?
As you suspect, we do not wrap every built-in that might throw an error. We just provide assert features to make it easy to redact and reveal parts of messages.
On Tue, Oct 25, 2022 at 9:41 AM Peter Hoddie @.***> wrote:
@erights – partial redaction of error.message is quite something! (I did read the link on error logging in Endo.
Thanks, glad you enjoyed it!
That was so interesting it took me down a (magic) wormhole into the extensive "survey of logging frameworks".)
IIRC mostly by @warner and @fudco . We still need to build a logging framework that addresses the dominant motivation of these --- logging potentially voluminous symbolic data for consumption by other tools, with only digested diagnostic info presented to humans. SwingSet's slogfiles do some of this, but specialized for SwingSet rather than as something available to regular vat code.
(None of which is actually relevant to the point of this thread though)
I was hoping we might be able to make some progress on this based on the idea from October 20. That approach would deny untrusted code running under XS access to error messages, eliminating a source of non-determinism. If/when that becomes a priority, let's re-open this issue to revisit the details. Until then, I'm going to close this out.
Describe
XS error messages are terse. Would Moddable consider PRs that increase the expressivity of XS error messages?
Why do you think this feature would be useful?
If, for example, the error message "no argument" were more descriptive, like "BigInt.prototype.toString requires 'this' to be a BigInt", developers would be able to more directly address exceptions.
Describe alternatives you've considered
XSBIGINTARG1.