tc39 / ecma262

Status, process, and documents for ECMA-262
https://tc39.es/ecma262/

Consider replacing instances of "implementation-defined" with "host-defined" #1524

Closed annevk closed 4 years ago

annevk commented 5 years ago

This makes it (even) more clear what details are up to the engine and which are up to the host.

ljharb commented 5 years ago

What’s the difference between the two? It seems like “the engine” and “the host” are both an unobservable part of the implementation as far as the spec is concerned.

cc @allenwb

allenwb commented 5 years ago

What if there is no "host"? It is certainly valid to implement ECMAScript as a self-contained implementation that runs programs on bare metal or a black box operating system.

The distinction between the "engine" (a colloquial term) and the "host" is itself an implementation detail.

ljharb commented 5 years ago

Additionally, node probably shouldn’t have any constraints on what v8 behavior it can intercept and override, as long as the resulting implementation complies with 262.

annevk commented 5 years ago

If there is no host it doesn't matter, but if there is it does, from the perspective of the host. (Perhaps not for Node.js, but definitely for the web platform.)

(Another issue I've been wondering about raising is that, while there are various Host* abstract operations, the contract for them isn't entirely clear, especially now that ECMAScript defines concepts all the way up to agent clusters.)

littledan commented 5 years ago

I like the idea of this wording change. Implementation-defined sounds like it's up to the individual implementer, which doesn't give the host a chance to define it for a group of several implementations (e.g., the web platform, or Node.js if it has multiple implementations). I don't think JS works without a host. About V8, I don't think we should consider either "implementation" or "host" to refer to the particular software layering here.

jmdyck commented 5 years ago

@annevk and/or @littledan : Could you define/distinguish "host" and "implementation" from the point-of-view of the ES spec?

This makes it (even) more clear what details are up to the engine and which are up to the host.

Can you give an example where it's important for the spec to be clear about that?

annevk commented 5 years ago
allenwb commented 5 years ago

See: https://mail.mozilla.org/pipermail/es-discuss/2010-July/011531.html

Prior to ES6 there had been long-standing confusion about what the ES spec. meant when it made a distinction between "host objects" and "native objects". We ultimately eliminated both terms because we came to realize that, from the perspective of the ES specification, that distinction added no requirements and provided no useful information. But the existence of the terms was causing readers to search for some meaning that wasn't there.

It was also clear that, from the perspective of the ES spec, there was no meaningful difference between the "host" and the "ES implementation". Unfortunately, we made two ES6 mistakes that continue to muddy these waters. First, we left in some legacy non-normative language in Clause 4 that dates all the way back to ES1. For example:

ECMAScript is an object-oriented programming language for performing computations and manipulating computational objects within a host environment.

We did add some new language, such as:

[ECMAScript was originally designed to be used as a scripting language], but has become widely used as a general purpose programming language.

We left in too much of the legacy text talking about non-normative concepts such as "scripting" and a "host environment". Coming in the first significant section of the standard, this leaks these terms/concepts into the specification and is obviously an on-going source of confusion. Clause 4 could use a complete rewrite, from a modern perspective.

The second mistake we made was to allow the term Host to be introduced into new [1] places in the specification. Specifically, as the [[HostDefined]] field of JobRecord, the InitializeHostDefinedRealm and HostResolveImportedModule abstract operations, and a few uses of "implementation or host environment defined" in the specification of Jobs. These all came late in the ES6 spec. development process and were either mistakes on my part or not worth fighting to eliminate at the time. Unfortunately, they have apparently seeded a number of other "Host" thingies in subsequent editions.

The important thing from an ECMA-262 perspective is that there is absolutely no difference in meaning between "the implementation" and "the host". They are both ways to say that there are potentially observable things whose specification is not part of ECMA-262. But to ECMA-262 it is irrelevant whether those things are defined by an "engine", a host application, an operating system, a user-defined configuration file, or something else.

Ideally, ECMA-262 should only use one term to talk about these external providers of observable specified behavior. It could be "host". It could be "implementation". From the perspective of writers and readers of programming language specifications, "implementation defined" is the most traditional term for this concept.

Finally, the fact that ECMA-262 says that something is "implementation defined" does not mean that a "JS engine" has complete freedom to do whatever it wants. Other specifications can mandate a specific behavior when the language is used in some specific context. For example, a JS engine that is used in the web platform needs to conform to various specifications and standards besides ECMA-262. The HTML spec. should say what it requires a "JS engine" to do for the various occurrences of ECMA-262 "implementation defined" that the web needs to be interoperable among browsers.

Regarding @annevk's observations:

implementation-defined manner if the host so chooses, whereas in fact the host would typically also define the way they are to be created

Exactly my point about a difference that makes no difference. "host-defined" and "implementation-defined" mean the same thing from the perspective of ECMA-262.

the "HTML host" wouldn't want to allow any implementation-defined properties on the global object

See Global Object last bullet and second bullet Clause 16. Remember that "host-defined" and "implementation-defined" mean the same thing. In practice, it appears that most JS engines have occasionally defined non-standard global properties. In fact, every time an engine includes a global defined as an experimental or stage 3 or 4 feature ahead of its inclusion in a ratified ECMA-262 edition, it is implementing an "implementation-defined global property". Of course, the HTML spec. is allowed to say that implementations of ECMA-262 for the web platform must not define any implementation-defined global object properties other than the following list defined by the HTML spec.

HostResolveImportedModule says it's implementation-defined whereas in fact we have quite detailed requirements for it.

"host-defined" and "implementation-defined" mean the same thing from the perspective of ECMA-262.

I think it's useful for readers of the specification to be able to distinguish such cases from Math.acos and such

The HTML specification may not wish to place any requirements on Math.acos. But a data analytics platform that supports JavaScript might want to state such requirements.

[1] There are also some legacy normative uses of host/host environment, primarily relating to I18N locale, that should have been cleaned up.

annevk commented 5 years ago

I don't see how they get to mean the same thing. Given two implementations of Host X, whatever is host-defined will behave the same, but whatever is implementation-defined can differ between those implementations. This is meaningful for implementers as they'll have to provide abstractions for things that are host-defined, but not those that are implementation-defined.

ljharb commented 5 years ago

Nothing in the spec cares about that difference though - can you help me understand why it would matter which part of “not the language” defines it, or why it would be important for the language to differentiate between different arbitrary building blocks of the language’s execution environment?

allenwb commented 5 years ago

I don't see how they get to mean the same thing. Given two implementations of Host X, whatever is host-defined will behave the same, but whatever is implementation-defined can differ between those implementations.

Not if the specification for Host X says something like:

  1. An implementation of Host X MUST include an implementation of ECMA-262
  2. The ECMA-262 implementation MUST implement the following "implementation-defined" characteristics of ECMA-262 as follows:
     a. Clause 8.5 step 8: The this binding for the realm's global scope is a WindowProxy Object as specified in ...
     b. Clause 8.5 step 11: The following additional properties are defined on the global object: ...
     c. ...

This is meaningful for implementers as they'll have to provide abstractions for things that are host-defined, but not those that are implementation-defined.

If you are building an ECMAScript implementation for such a host you must fulfill the requirements of ECMA-262 and the requirements mandated by the host. In general, the authors of ECMA-262 will not know which of its "implementation-defined" characteristics will be mandated by a host (if there even is a host) and which such a host is happy to leave unspecified.

annevk commented 5 years ago

Okay, but let's provide a clearer interface than a host having to replace specific clauses and steps, as maintaining that over time would not be great. I suspect there are a number of things implementations would not ever want to see defined, but I'm okay with putting the burden of figuring that out on hosts.

I do think this stance is a little weird as TC39 does end up defining certain formerly implementation-defined details over time, meaning you are breaking (theoretical) hosts.

littledan commented 5 years ago

I'd rather avoid this sort of monkey-patching style of host specifications. This is brittle (e.g., if step numbering changes for editorial reasons) and it's hard to follow the cross-references (what do you search for?). I like formulating these things as explicit host hooks better.

allenwb commented 5 years ago

@annevk Not everything that is specified as "implementation-defined" is something that a "host" such as the web platform is likely to want to specify. A good example is the Math.acos accuracy. It would be pretty awkward for the spec. to have to provide an "explicit host hook" for all such situations.
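
To make the Math.acos case concrete, here is a minimal sketch of how a program might cope with implementation-approximated results. The helper name `closeTo` and the tolerance are assumptions chosen for illustration, not anything ECMA-262 or this thread prescribes. The one exact check relies on the spec's requirement that Math.acos(1) is +0.

```javascript
// Hypothetical helper: compare two numbers within a relative tolerance.
// The default tolerance is an arbitrary illustrative choice.
function closeTo(actual, expected, relTol = 1e-12) {
  return Math.abs(actual - expected) <= relTol * Math.abs(expected);
}

// ECMA-262 pins down some inputs exactly: acos(1) must be +0.
const exactCase = Math.acos(1);

// For most inputs the result is an implementation-approximated value,
// so portable code compares against a tolerance instead of using ===.
const approxCase = Math.acos(0.5); // mathematically pi / 3
const reference = Math.PI / 3;

console.log(Object.is(exactCase, 0));
console.log(closeTo(approxCase, reference));
```

Whether `approxCase === reference` holds is exactly the kind of detail that may vary between engines, which is why the sketch avoids asserting it.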

If the concern is that implementations might not notice where something is implementation-defined, it would certainly be possible to add an Annex that lists all such occurrences. We once made such a thing for ES3 and used it to evaluate which of its "implementation-dependents" should be eliminated.

@littledan

I like formulating these things as explicit host hooks better

That's exactly what we do now via the xxxHostyyy abstract operations. In fact, Clause 8.5 mentioned in the strawman example I provided:

a. Clause 8.5 step 8: The this binding for the realm's global scope is a WindowProxy Object as specified in ...

is the InitializeHostDefinedRealm abstract operation. Like most such "host hooks" it is a combination of specific requirements and several partially specified requirements that the implementation/host must complete. What would you do differently here other than in my strawman say "InitializeHostDefinedRealm" rather than "Clause 8.5"?

annevk commented 5 years ago

@allenwb seems like we agree? The concern is that I'd like it to be clear what a host is expected to specify and control, and that clear interfaces are provided for that. (Ideally in the form of abstract operations such as CreateAgentCluster(...), CreateAgent(...), AppendAgentToAgentCluster, etc. so hosts get to have the same rigor that ECMAScript has for itself.)

And, on the other side, that bits not under host control are also differentiated as such.

allenwb commented 5 years ago

@annevk Except on the use of the word "host". See https://github.com/tc39/ecma262/issues/1524#issuecomment-488008927

The distinction between an ECMAScript "engine" and an "ECMAScript host" just isn't meaningful from the perspective of ECMA-262. Saying either "host" or "engine" would bias ECMA-262 towards a specific factoring of responsibilities between a "host" and an "engine" and against other factoring such as a host-less implementation in an embedded environment or an application that directly implements the ECMAScript language as an indivisible part of the application.

The term "the implementation" does not carry such biases.

annevk commented 5 years ago

All I'm saying is that using the term "implementation" exclusively for both things TC39 might want to define in more detail in the future and for things that are pluggable by what's known as hosts today is confusing for readers, implementers, and those defining hosts. I don't care too strongly about what things are called in the end.

(I also have a somewhat different concern that the different pluggable bits are not clearly scoped and "holding everything right" is rather tricky. E.g., that InitializeHostDefinedRealm is to be done in the context of an Agent, which is created in the context of an Agent Cluster, is all rather hand-wavy.)

allenwb commented 5 years ago

In ancient times, one use of "implementation-defined" was for semantics where there was disagreement among implementations and TC39 was unwilling or unable to choose one. By the ES5 days we had learned that such indecision was poor policy and made an effort to eliminate as many of those usages as possible (and to not introduce new ones). The remaining ones were either places where pragmatic implementation variation was reasonable (e.g., precision of Math functions) or places where implementations could add value (including external integration) without introducing interoperability concerns.

I don't think you should be overly concerned about things in the latter case being retracted in the future. If there are specific such cases that you are worried about we should probably take a look at them.

You may be justified in your concerns about Agents/AgentClusters; I haven't seriously reviewed them from this perspective. If there are issues, I suspect "implementation-defined" terminology is not the primary problem.

More generally, I would agree that extra scrutiny should be applied any time "implementation-defined" (or any equivalent term) is added or removed from the specification. Specifiers of new features don't always have the full history or perspective to make decisions that are consistent with past practice.

domenic commented 5 years ago

Addressing comments such as https://github.com/tc39/ecma262/issues/1524#issuecomment-488008927, https://github.com/tc39/ecma262/issues/1524#issuecomment-493462944, and https://github.com/tc39/ecma262/issues/1524#issuecomment-493544187, my perspective is as follows.

The spec does not exist in a vacuum; the purpose of picking specific words is not to align with historical usage, or to provide an ideal meaning that makes sense only in the context of ECMA-262. We should pick words that are maximally useful to the ecosystem.

In other words, as always, I refer us back to the priority of constituencies: users over authors over implementers over spec-writers over theoretical purity. Using a single catch-all, such as "implementation-defined", may be theoretically pure, but it disadvantages implementers trying to implement JavaScript as part of a larger piece of software (such as a web browser or Node.js), and disadvantages authors trying to understand how JavaScript works in such large settings in practice. It can even disadvantage users, when confusion on the implementer side leads to non-interoperable implementations that could otherwise be interoperable.

I think using different words to differentiate the different classes of not-defined-in-262 behavior has been working well so far, and encouraged a more interoperable ecosystem. In particular, the division between things like for-in order, Array.prototype.sort behavior, and Math function precision ("implementation-defined"), versus things like promise rejection tracking, realm initialization, string compilation restrictions, and error reporting ("host defined") has served us very well in building the web and Node.js ecosystems, and allowing us to coordinate. I hope we can continue this trend into the future, and not try to erase the distinction for reasons of theoretical purity or reasons of over-focusing on the spec-writer perspective from which both hosts and implementations may appear equally out-of-scope.

bakkot commented 4 years ago

The argument for having just one term is that using two words for the same concept is confusing to readers, not "theoretical purity". Currently, to my knowledge, nothing anywhere explains that or how they are different except @domenic's comment immediately above mine. And I don't think it's something a typical reader, even a typical implementor, would figure out on their own.

Anyway, if I understand @annevk and @domenic correctly, their goal is to distinguish between things that e.g. the HTML spec is expected to specify vs things that ultimately are going to be up to individual engines. I think that these are reasonable things to want to distinguish. I just don't agree that they are currently distinguished. If we go this route, which I would be fine with, we would need to write down that this is what we are doing.

Alternatively, we could consistently use "implementation-defined" rather than "host-defined", but add a note to every case which is fully specified by the HTML standard noting that and where it is specified. Such notes would probably be useful even if we did use different terms, actually.

erights commented 4 years ago

I agree that it is useful --- needed even --- to distinguish things that vary by host vs things that vary by implementation. I also agree that when we try to use these consistently, we should define what we mean, and we should go through the current spec to use the correct words in the correct places. Since no one is prepared to do that spec cleanup now, this can wait until someone is.

ljharb commented 4 years ago

I’m still unclear on how it is either useful or needed. When reading 262, there’s no difference to me between host or implementation - it’s either the language spec, or the thing implementing it, that i care about.

If 262 isn’t mandating it, a single other category is - whether we call that “host” or “implementation” doesn’t matter to me, but nobody has yet conveyed why that single category needs to be bifurcated from the perspective of 262.

bakkot commented 4 years ago

When reading 262, there’s no difference to me between host or implementation - it’s either the language spec, or the thing implementing it, that i care about.

Sure, but there are other readers of the specification, some even in this thread, who have said that they do care about this difference. You are not the only reader of the spec.

If 262 isn’t mandating it, a single other category is

It is possible to draw your categories this way, but it is also possible to draw them in a different way, where this is not true. For example, you can draw them in a way which separates things for which a particular behavior is mandated by the HTML spec from things for which the behavior is truly unspecified.

devsnek commented 4 years ago

@ljharb for example v8 implements promises, and node implements HostPromiseRejectionTracker. v8 is the implementation and node is the host. obviously it isn't always that clear cut (v8 doesn't make node implement weakref cleanup) but the key is having the difference. (same example applies to v8 and chrome, which i think is what domenic was getting at with html)

allenwb commented 4 years ago

@ljharb @bakkot

A well-known Spock quote: "A difference that makes no difference is no difference."

Fundamentally, the purpose of ECMA-262 is to define the meaning of programs coded in ECMAScript. So, take ECMA-262 and (at least mentally) replace all occurrences of "host" with "implementation". Does it change the specified meaning of any program? I did this exercise on the normative parts of the ES6 spec. and I believe the answer is no. In either case, the spec. says that in certain circumstances the meaning of certain parts of a program is defined by some authority, external to ECMA-262. Whether you call that authority the "implementation" or the "host" is technically irrelevant.

Now do the same exercise, but instead of "implementation" replace "host" with "engine" or "container" or "processor" or "whatwg" or "brendan" or "google". Does it change the specified meaning of any program? No, it still says you have to consult some external authority to know the complete meaning of some programs.

But words do make a difference to people and influence their thoughts and actions. An ECMAScript implementation does not necessarily need to be "hosted" by another software platform or application. It is perfectly acceptable to implement ECMAScript such that it runs directly against an operating system or on a bare-metal processor. Do we want ECMA-262 to leave the impression with potential ECMAScript implementors that some sort of "host" is required? I don't think we should, any more than we would want to leave the impression that consultation with Google is required.

There are many words that we could choose to refer to the external authority(ies) that define the meaning of the parts of ECMAScript programs that are intentionally un- or under-specified by ECMA-262. But many of the possible words carry the sorts of unintended subtle implications we should avoid.

To me, "implementation" remains the most general and least encumbered (with unintended implications) word that comes to mind. After all, even when there are other "authorities" such as a host application or an operating system involved, it is the implementors of an ECMA-262 engine that make the decision if and how to delegate to those authorities.

syg commented 4 years ago

My thoughts here: https://github.com/tc39/ecma262/pull/1903#issuecomment-601399410

annevk commented 4 years ago

I'm having a hard time understanding the controversy. The specification already has a number of Host* hooks. It's fine if those default to implementation-defined absent a host, similar to NaN handling (which is not a Host* hook, and if a host required specific NaN handling, implementations would object). But acknowledging there are (standardized) hosts and how things fit together is somewhat important.

Hosts could resort to monkey patching (which is what is happening to some extent around agents as they're a mess) but this usually leads to subtle bugs as it's unclear how the systems integrate. It seems somewhat essential for a thing that could be used as a subsystem to expose a clear useful API.

allenwb commented 4 years ago

@annevk

The specification already has a number of Host* hooks.

Yes, and they are important. But the naming is unfortunate as they really are "implementation hooks" that apply whether or not there is a "host" that is distinct from an ECMAScript "engine". I'm partially responsible for the current naming as I agreed to it in the rush to complete ES6. I wish I hadn't.

It seems somewhat essential for a thing that could be used as a subsystem to expose a clear useful API.

However, the Host* hooks should not be thought of as APIs, because they are not. The Host* hooks are a specification device used to express constraints that are imposed upon an implementation in how it provides various "implementation defined" functionality. The ECMA-262 specification does not require or expect that an "engine" [1] actually expose an API that allows another software artifact (a "host") to provide such functionality.

My main concern is that having two concepts, "implementation defined" and "host defined", which generally mean the same thing introduces unnecessary complexity and potential confusion into the specification. It forces spec writers to unnecessarily make distinctions between engine and host requirements, and such writers are likely to do so inconsistently (this was a problem in ECMA-262 editions prior to ES6 and possibly also after).

Finally, I appreciate that the largest use of ECMA-262 is in an application where there is a strong separation between the engine and the host (interestingly, this distinction was probably historically a direct result of Conway's Law). That constituency is very important and vocal in representing their positions. But it is not the only significant use of ECMA-262 and I think it is useful to present other perspectives.

[1] or "implementation". To me those words mean the same thing, but in this instance the concreteness of "engine" seems more appropriate.

annevk commented 4 years ago

I think we're in agreement. When I wrote API above I meant only for specifications that build on the ECMAScript specification. I don't mean that I would expect implementations to do things exactly that way. Just like they don't have to implement other algorithms exactly as written. It's all about being blackbox indistinguishable.

But I do think having some distinction is useful as implementations don't want someone to settle NaN for instance. That has to remain in TC39's jurisdiction.

syg commented 4 years ago

We discussed this on the editor call. The editor group's plan is to:

Edit: Stay tuned.

erights commented 4 years ago

What does "blackbox indistinguishable" mean?

annevk commented 4 years ago

How will it allow an upstream spec to cleanly integrate? I really don't understand why we don't enshrine the host concept and just have a default implementation of a host that makes all host hooks implementation-defined.

syg commented 4 years ago

I see there is general unhappiness about the plan. I'll bring it up again at the next call.

syg commented 4 years ago

What does "blackbox indistinguishable" mean?

I was borrowing @annevk's phrase. I think it basically means observationally equivalent? An implementation can do whatever as long as it's observationally equivalent to invariants laid out for that specific operation.

syg commented 4 years ago

I really don't understand why we don't enshrine the host concept and just have a default implementation of a host that makes all host hooks implementation-defined.

From my chat with @annevk, recapped here.

Enshrining distinguished concepts normatively has, well, normative implications. Right now, "host" and "implementation", in the scope of ecma262, mean exactly the same thing normatively. If we enshrine the distinction and say e.g. Number::exponentiate is implementation-defined but not host-defined, that precludes other specs from imposing additional requirements. That's not a reasonable preclusion -- if there were a separate standard for all embedded JS engines, it is reasonable for that standard to require that e.g. the fdlibm implementation must be used.

Editorially, hosts and implementations definitely do not mean the same thing, and I'm still trying to come up with a solution that satisfies both TC39 and the HTML stakeholders here...

erights commented 4 years ago

The spec only traffics in observables. Anything that's not observable should at most be in a non-normative note. The things that are implementation-defined or host-defined are observable. Examples:

Sort algorithm is implementation defined. It is observable, for example, by the calls made on the comparefn, which can observe and report them.

Job scheduling order is host-defined. It is again observable.

Whether or not WeakMaps use the transposed representation can be determined with high probability by testing the relative performance of gc under different forms of stress. However, for purposes of the spec, this is not an observation.
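
The sort example can be sketched in a few lines. This is only an illustration of the observability claim; the function name `sortAndRecord` is made up for the sketch. The comparefn records every pair it is asked to compare, so the program can see which comparisons the engine's algorithm chose to make.

```javascript
// Hypothetical helper: sort a copy of the input and record each
// (a, b) pair that the engine passes to the comparefn.
function sortAndRecord(values) {
  const calls = [];
  const sorted = [...values].sort((a, b) => {
    calls.push([a, b]); // the engine's choice of comparisons leaks out here
    return a - b;
  });
  return { sorted, calls };
}

const { sorted, calls } = sortAndRecord([3, 1, 2]);
console.log(sorted); // the sorted result is the same everywhere
console.log(calls);  // the recorded pairs may differ from engine to engine
```

The sorted result is fully determined for a consistent comparefn, while the number and order of recorded pairs depends on the sorting algorithm, which is the implementation-defined part.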

allenwb commented 4 years ago

And, of course, it is unobservable whether job scheduling order or IEEE FP arithmetic or most other behaviors is provided by the "ECMAScript engine", a "host application", or a hardware processor.

A difference that makes no difference is no difference.

erights commented 4 years ago

IEEE FP is not implementation/host/processor/whatever dependent. Ecma 262 explicitly standardizes on IEEE FP double precision round to even.

The only X-dependent issue remaining is which NaN bit representation gets exposed for NaN.
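
A small sketch of where that NaN encoding becomes visible to a program, assuming typed arrays are the channel in question. The function name `nanBytes` is invented for the illustration. Writing NaN into a Float64Array and reading the same buffer through a Uint8Array exposes whichever encoding the implementation chose.

```javascript
// Hypothetical helper: store NaN as a binary64 value and return the
// eight underlying bytes as a hex string.
function nanBytes() {
  const floats = new Float64Array(1);
  floats[0] = NaN; // the implementation picks the NaN bit pattern to store
  const bytes = new Uint8Array(floats.buffer);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// The exact bytes (and their order, which follows platform endianness)
// are not fixed by the language, so no expected output is shown here.
console.log(nanBytes());
```

Reading the value back through the Float64Array always yields NaN; only the raw bytes can differ.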

allenwb commented 4 years ago

IEEE FP is not implementation/host/processor/whatever dependent. Ecma 262 explicitly standardizes on IEEE FP double precision round to even.

My point is that it isn't directly observable to ES code which of those hypothetical layers (or others) actually implements it, as long as it conforms to the requirements of ECMA-262. All requirements have that characteristic. Those layers are all part of "the implementation".

erights commented 4 years ago

If that's your point, then I don't understand the point of your point. The difference between IEEE double precision round-to-even implementations is not observable. The difference between sorting algorithms is observable. If your point sees these as being in a similar observability category, then I simply don't understand what you're trying to say.

allenwb commented 4 years ago

I wasn't trying to make a point about the differences between the category of ECMA-262 requirements and the category of ECMA-262 observable implementation-defined behaviors.

I was simply trying to reinforce the view that it is irrelevant, from the perspective of ECMA-262, how an "implementation" is factored to support those categories. Whether an "implementation" is a monolithic blackbox (potentially all the way down to the silicon) or is structured into multiple black, white, and grey boxes just isn't relevant to ECMA-262.

annevk commented 4 years ago

I would be okay with equating implementation and host, if everything that was "implementation-defined" had a relevant abstract operation whose details are to be set by the implementation. The HTML Standard could then override some of these abstract operations and the whole setup would remain coherent.

I think it might still be a bit confusing as the abstract operations not overridden would be up to the actual implementation, not the HTML Standard, but if that allows ECMA-262 to pretend to live in a vacuum, so be it.

ECMA-262 should then also get rid of all mentions of "host" though.
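
As a rough model of this proposal (not how any spec or engine is actually structured), each under-specified behavior can be pictured as a named operation with a default, where an embedding spec overrides only the ones it cares about. Every name below is hypothetical; `promiseRejectionTracker` merely stands in for a hook like HostPromiseRejectionTracker, and the object spread is just a convenient way to express "override some, inherit the rest".

```javascript
// Hypothetical defaults: what "the implementation" does when nobody
// else specifies the behavior.
const defaultOperations = {
  // Stand-in for a hook such as HostPromiseRejectionTracker; does nothing.
  promiseRejectionTracker(promise, operation) {},
  // Stand-in for an accuracy-sensitive operation left to the engine.
  acos(x) {
    return Math.acos(x);
  },
};

// An embedding spec supplies overrides; everything else keeps its default.
function makeImplementation(overrides = {}) {
  return { ...defaultOperations, ...overrides };
}

// An HTML-like embedding that tracks rejections but leaves acos alone.
const rejections = [];
const htmlLike = makeImplementation({
  promiseRejectionTracker(promise, operation) {
    rejections.push(operation);
  },
});

htmlLike.promiseRejectionTracker(null, "reject");
console.log(rejections);
console.log(htmlLike.acos === defaultOperations.acos);
```

In this picture the operations that are not overridden remain up to the actual implementation, which is the residual source of confusion the comment above mentions.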

ljharb commented 4 years ago

The plan here included removing mentions of the word "host-defined".

gibson042 commented 4 years ago

Resolution of this issue should also address "implementation-dependent", which appears almost twice as much as "implementation-defined" (albeit mostly as "implementation-dependent approximation" used in defining Math).

$ for what in host implementation; do for how in defined dependent; do echo "$what-$how: $(grep -io $what-$how spec.html | wc -l)"; done; done
host-defined: 2
host-dependent: 0
implementation-defined: 37
implementation-dependent: 66
$ grep -io 'implementation-dependent approximation' spec.html | wc -l
42