chakra-core / ChakraCore

ChakraCore is an open source Javascript engine with a C API.
MIT License
9.09k stars 1.19k forks source link

Review Hosting API #6416

Open rhuanjl opened 4 years ago

rhuanjl commented 4 years ago

The CC external API also called JSRT could do with a review prior to the next release.

Roughly speaking it contains:

  1. "Windows only APIs": These are historic chakra.dll APIs that pre-date the cross platform nature of chakracore; they can actually be used on linux or macOS with some caution - notably they use strings that are rows of 16bit characters - normal on windows but not normal on other platforms.

  2. Settled Chakracore APIs: Methods included in version 1.11 and in most cases several previous versions - many of these are marked as "experimental" in documentation but in practise have been stable for some years and should probably be labelled as such.

  3. Master only APIs: Various APIs have been added in master, some of these are documented, some aren't - in particular several were added in #5923 that seem to follow a very different design philosophy to all the previous APIs. We should review these and consider removal or documentation on a case by case basis prior to a stable release.

Overall I think we should aim at maintaining an API with a clear design philosophy that works in a consistent way AND supports stability and is relatively easy to maintain - part of this is keeping it lean hence my aversion to some of the additions from #5923.

zenparsing commented 4 years ago

Agreed - since we want CC to be an excellent engine for embedding we should give the API a lot of thought.

There are also some functions that are just really awkward (I'm thinking of JsSetModuleHostInfo in particular). Do you think we could get away with making some breaking changes just for 1.12, to set us on a really solid path?

Taritsyn commented 4 years ago
  1. "Windows only APIs":

In .NET I have long since given up using the APIs, that defined in the ChakraCommonWindows.h file. Usage of this APIs caused crashing of performance tests written by using the BenchmarkDotNet library:

// * Warnings *
Environment
  Summary -> Detected error exit code from one of the benchmarks. It might be caused by following antivirus software:
        - Windows Defender (windowsdefender://)
        - Norton Security (C:\Program Files\Norton Security\Engine\22.15.0.88\WSCStub.exe)
Use InProcessToolchain to avoid new process creation.

Having completely abandoned this APIs, nevertheless, some OS-specific usage remained. For example, on Windows I use the JsCreateStringUtf16 and JsCopyStringUtf16 methods to work with strings, and on Linux and Mac I use JsCreateString and JsCopyString methods. Unfortunately, now there is lack version of the JsCreatePropertyId method that would support UTF-16 encoding.

rhuanjl commented 4 years ago

@zenparsing

Agreed - since we want CC to be an excellent engine for embedding we should give the API a lot of thought.

There are also some functions that are just really awkward (I'm thinking of JsSetModuleHostInfo in particular). Do you think we could get away with making some breaking changes just for 1.12, to set us on a really solid path?

I'm aware that the module API as a whole is not great - it has some awkward idiosyncrasies which I think stem from some missteps in its original implementation. JsSetModuleHostInfo probably has too many roles and the fact that it expects you to provide a moduleReference when half of its uses really need a scriptContext instead is silly. Additionally I think the whole logic of how ParseModuleSource works internally is unhelpful - which in turn means the callbacks you have to set up and the way the process works are not ideal.

This is probably the only part of the "settled" API I'd consider changing - though I'd be hesitant even here - could we perhaps introduce a better version and leave the existing one knocking around for backwards compatibility?

@Taritsyn

  1. "Windows only APIs":

In .NET I have long since given up using the APIs, that defined in the ChakraCommonWindows.h file. Usage of this APIs caused crashing of performance tests written by using the BenchmarkDotNet library:

I was not aware of this breakage - I wonder if we should entirely deprecate the "windows only" stuff - I'd like to keep it for backwards compatibility but I'm not keen on putting effort into fixing it. BUT there are a few things from it I could see trying to rescue - such as JsGetStringPointer which exposes a pointer to the internal string behind a Js String without copying it - there's potentially significant performance benefit to be had in using that in some cases.

Having completely abandoned this APIs, nevertheless, some OS-specific usage remained. For example, on Windows I use the JsCreateStringUtf16 and JsCopyStringUtf16 methods to work with strings, and on Linux and Mac I use JsCreateString and JsCopyString methods. Unfortunately, now there is lack version of the JsCreatePropertyId method that would support UTF-16 encoding.

It sounds like a JsCreatePropertyIdUtf16 would be a sensible addition.

rhuanjl commented 4 years ago

@zenparsing I'm thinking more about that module API and weighing up my desire to avoid breakage against what a good API would actually look like:

  1. The JsSetModuleHostInfo method is frankly horrid in part due to my own past additions to it, mea culpa, it has too many different purposes and the variety of activities that require in some cases a script context and in other cases a moduleRecord are confusing.
  2. The API gives you 3 different ways of obtaining an error from a module being loaded - this is unnecessary.
  3. Considering that the various properties you may wish to set on an individual module are all available when creating it it doesn't make sense that you can't supply them to JsInitializeModuleRecord and have to use JsSetModuleHostInfo repeatedly instead.
  4. The basic mechanism for loading a module is unnecessarily abstruse and the interactions with the host are different to the spec.
Activity ChakraCore Spec
List of Child Modules to fetch Callback fired for each one Method for Host to call to request list
Link module graph to prepare for execution Done automatically when last module in graph is parsed Host expected to track loading process and call a method when ready
Triggering execution Notification via callback from link to trigger As Link is synchronous host can call execution immediately after

All that aside - whilst CC's method is initially hard to understand I think it actually requires the host to do a whole let less to implement it than the spec envisages. Notably per spec the host has to track/determine if the whole module graph is loaded whereas CC does that for you - though I do wonder if/how CC's tracking would hold up if a user calls ParseModuleSource from a different thread - I fear that that could lead to undefined behaviour.

I'd be interested what anyone else thinks of this and whether people would prefer:

  1. Sticking to the existing API
  2. Making some improvements to the existing API - possibly with new functions sitting side by side with the old e.g. one method for setting the callbacks on the context. And a new JsCreateModuleRecord method that sets all the information to avoid repeated JsSetModuleHostInfo calls.
  3. Adding a new side by side API so maintaining backwards compatibility whilst adding a nicer alternative for new projects OR
  4. Completely replacing the existing API (assume that not many existing projects use modules and hence go to town on making it nicer for future projects)

One extra note, the module implementation within ch is more complicated than it should be due to specifically supporting synchronously throwing for syntax errors in root modules which the API isn't really designed to support - as the module load process is intended to be async. (though ParseModuleSource does synchronously tell you about syntax errors)

rhuanjl commented 4 years ago

Having another think on modules - with a possible path for alternate methods that could sit alongside the existing ones:

// create a module record with the given normalizedSpecifier
// optionally a url can be supplied which will be used in error stack traces
// optionally hostDefined data can be supplied which can then be accessed via JsGetModuleRecordInfo
JsCreateModuleRecord(
    _In_ JsValueRef normalizedSpecifier,
    _In_opt_ JsValueRef url,
    _In_opt_ void* hostDefined,
    _Out_ JsModuleRecord moduleRecord);
// retrieve data about a module record
JsGetModuleRecordInfo(
    _In_ JsModuleRecord moduleRecord,
    _Out_opt_ JsValueRef normalizedSpecifier,
    _Out_opt_ JsValueRef url,
    _Out_opt_ void* hostDefined);
// set a module callback on the current context
// see JsModuleCallbackKind enum for callback types
JsSetModuleCallback(
    _In_ void* callback, 
    _In_ JsModuleCallbackKind callbackKind);

With these three APIs we'd eliminate the need for JsInitializeModuleRecord, JsSetModuleHostInfo and JsGetModuleHostInfo - BUT - we wouldn't break the mechanisms they sit on top of, so we could add these and leave the existing ones as deprecated but still functional for backwards compat.

zenparsing commented 4 years ago

@rhuanjl Yes, I think deprecating the old module API is the way to go.

rhuanjl commented 4 years ago

@rhuanjl Yes, I think deprecating the old module API is the way to go.

  • Do you think we can leave out JsGetModuleRecordInfo at first?
  • Perhaps we should have separate functions for each callback type, so that the compiler can type check args?
  • We'll also need a function for setting the module error, for instance when the module file can't be read.
  1. I think you might need JsGetModuleRecordInfo in order to implement an import.meta callback. Also if we didn't have it hostDefined data on a module record would become meaningless.

  2. I personally like having as few APIs as possible within reason - there are already 4 module callbacks to set AND top level await will add a 5th - I was tempted to propose one method that sets all of them but concerned a spec change could introduce a 6th later. That said I see the advantage/simplicity of an API per callback and hence not having an enum. I don't know what's best here.

  3. We don't need a function to set a module error - JsParseModuleSource handles this if you call it with a nullptr as the sourceText parameter - this is actually what ch does.

EDIT: I don't like having normalized specifier and url as two separate properties, I've just taken a look through the code and I think they can be fairly easily combined in a largely non-breaking way so that will simplify things a little

Currently normalized specifier (which is provided when creating a module record BUT can never be retrieved by the host) is used in debug tracing AND if reporting a syntax error. URL which can be updated and retrieved is used in runtime error stack traces.

The only reason they're separate is that some of the module tests provide invalid specifiers and attempting to push those into stack traces causes an error when trying to load dynamic profile information for test cases. I created the URL property so we could have module URLs in stack traces without needing to edit those tests - which was a frankly short sited approach - so good that I should be able to reverse it without breaking anything (other than a few weird tests cases).

rhuanjl commented 4 years ago

So see linked PR for merging of URL and Specifier AND to internalise root module detection, these two changes significantly tidy up what we need to do. Updated proposed API:

  1. Module creation:

    // create a module record with the given normalizedSpecifier
    // optionally hostDefined data can be supplied which can then be accessed via JsGetModuleRecordInfo
    JsCreateModuleRecord(
    _In_opt_ JsValueRef normalizedSpecifier,
    _In_opt_ void* hostDefined,
    _Out_ JsModuleRecord moduleRecord);
  2. Getting data (e.g. for import.meta)

    // retrieve data about a module record
    JsGetModuleRecordInfo(
    _In_ JsModuleRecord moduleRecord,
    _Out_opt_ JsValueRef normalizedSpecifier,
    _Out_opt_ void* hostDefined);
  3. Setting callbacks - 5 variants of this, one for each callback (or one method with an enum OR one method that sets them all) I'm undecided.

    // set a module callback on the current context
    JsSetFetchModuleCallback(
    _In_ FetchImportedModuleCallBack* callback);

EDIT: I had previously said we'd need to specify if a module was root or not - I've now updated #6446 to internalise this detection so draft API above no along includes that.