jerryscript-project / jerryscript

Ultra-lightweight JavaScript engine for the Internet of Things.
https://jerryscript.net
Apache License 2.0

API Update proposal #4186

Open dbatyai opened 4 years ago

dbatyai commented 4 years ago

Since the last major release, several new ECMAScript features have been implemented in the engine. Some of these features, especially proxies and modules, do not fit well with the current public API. Several features also lack public API functions partially or even completely, and the naming scheme of several functions is inconsistent. We would like to update the public API to address these issues and hopefully make it easier to use in the process.

We're also looking for feedback on the newly proposed functions, and for help identifying other parts of the API that could use improvement. If you have an opinion on the matter, please feel free to comment.

Naming scheme

We'd like to create a unified naming scheme for all API functions, where functions would be grouped by the type of the value they operate on, meaning all functions will take the jerry_<operand-type>_<operation> form. This should make the public API easier to navigate and use for embedders, and also make it easier to use code completion.

Here's an example of how the new grouping would look.

 Before:                                     After:

 jerry_get_string_size                       jerry_string_get_size

 jerry_resolve_or_reject_promise             jerry_promise_resolve_or_reject
 jerry_get_promise_result                    jerry_promise_get_result

 jerry_create_array                          jerry_array_create
 jerry_has_property                          jerry_object_has
 jerry_has_internal_property                 jerry_object_has_internal

 jerry_init_property_descriptor_fields       jerry_property_descriptor_init
 jerry_free_property_descriptor_fields       jerry_property_descriptor_free

 jerry_call_function                         jerry_function_call
 jerry_construct_object                      jerry_function_construct

 jerry_create_arraybuffer                    jerry_arraybuffer_create
 jerry_detach_arraybuffer                    jerry_arraybuffer_detach
 jerry_get_arraybuffer_pointer               jerry_arraybuffer_get_pointer
 jerry_arraybuffer_read                      jerry_arraybuffer_read

Reference counting functions

We would also like to address a minor naming issue with jerry_acquire_value and jerry_release_value. We feel that acquire and release are a poor choice of words, as they can easily be misspelled. We would like to rename these to jerry_value_copy and jerry_value_free respectively (following the naming scheme defined above).

String handling

We also feel string handling could be improved in general. Accessing the raw contents of jerry strings can be awkward at times, as they need to be copied to external buffers. The string API functions are also a bit convoluted, with separate functions for the different encodings.

First of all we'd like to simplify encoding handling by adding a new enum. This enum can be used as a function argument to select the string encoding, and will allow us to reduce the number of required string functions. This should also come in handy for other functions that operate on raw strings, for example JSON (#4160).

typedef enum
{
  JERRY_ENCODING_CESU8,
  JERRY_ENCODING_UTF8,
} jerry_encoding_t;
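
As an illustration of what the two enum values select (not part of the proposal itself), the following minimal sketch encodes a single code point both ways. CESU-8 and UTF-8 agree for code points below U+10000; above that, UTF-8 uses one 4-byte sequence while CESU-8 encodes the UTF-16 surrogate pair as two 3-byte sequences. The function names example_encode_utf8 and example_encode_cesu8 are hypothetical helpers and do not exist in the JerryScript API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a value below 0x10000 as 1 to 3 bytes.
 * Both encodings share this part. */
static size_t
encode_bmp (uint32_t cp, uint8_t *out_p)
{
  if (cp < 0x80)
  {
    out_p[0] = (uint8_t) cp;
    return 1;
  }
  if (cp < 0x800)
  {
    out_p[0] = (uint8_t) (0xC0 | (cp >> 6));
    out_p[1] = (uint8_t) (0x80 | (cp & 0x3F));
    return 2;
  }
  out_p[0] = (uint8_t) (0xE0 | (cp >> 12));
  out_p[1] = (uint8_t) (0x80 | ((cp >> 6) & 0x3F));
  out_p[2] = (uint8_t) (0x80 | (cp & 0x3F));
  return 3;
}

/* UTF-8: a supplementary code point becomes a single 4-byte sequence.
 * out_p must have room for at least 4 bytes. */
size_t
example_encode_utf8 (uint32_t cp, uint8_t *out_p)
{
  assert (cp <= 0x10FFFF);
  if (cp < 0x10000)
  {
    return encode_bmp (cp, out_p);
  }
  out_p[0] = (uint8_t) (0xF0 | (cp >> 18));
  out_p[1] = (uint8_t) (0x80 | ((cp >> 12) & 0x3F));
  out_p[2] = (uint8_t) (0x80 | ((cp >> 6) & 0x3F));
  out_p[3] = (uint8_t) (0x80 | (cp & 0x3F));
  return 4;
}

/* CESU-8: a supplementary code point is split into a UTF-16 surrogate
 * pair, and each surrogate is encoded as its own 3-byte sequence.
 * out_p must have room for at least 6 bytes. */
size_t
example_encode_cesu8 (uint32_t cp, uint8_t *out_p)
{
  assert (cp <= 0x10FFFF);
  if (cp < 0x10000)
  {
    return encode_bmp (cp, out_p);
  }
  uint32_t v = cp - 0x10000;
  size_t n = encode_bmp (0xD800 | (v >> 10), out_p);
  return n + encode_bmp (0xDC00 | (v & 0x3FF), out_p + n);
}
```

For U+1F600 the UTF-8 helper writes F0 9F 98 80 and the CESU-8 helper writes ED A0 BD ED B8 80, so the same string can have different sizes depending on the selected encoding.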

We also propose a new string API that simplifies string handling and also allows directly accessing the contents of a jerry string, if the string type supports it:

const jerry_string_t jerry_string_get (jerry_value_t string_value, jerry_encoding_t encoding);
void jerry_string_free (const jerry_string_t *string_p);

Where jerry_string_t is a structure containing a buffer pointer to the string contents, the size and length of the string, and some additional flags.

typedef struct
{
  const uint8_t *buffer_p;
  uint32_t size;
  uint32_t length;
  uint32_t flags;
} jerry_string_t;

If the string type allows it, the buffer can be used to directly access the raw string data. Otherwise a new buffer will be allocated automatically, which means it will no longer be necessary to allocate buffers externally.
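
The borrow-or-allocate behaviour described above could be modelled roughly as follows. This is only a sketch under stated assumptions: the proposal does not say which flags exist, so EXAMPLE_STRING_FLAG_ALLOCATED is a hypothetical flag, and the example_* types and functions are stand-ins that work on plain byte arrays rather than on jerry values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the layout of the proposed jerry_string_t. */
typedef struct
{
  const uint8_t *buffer_p;
  uint32_t size;
  uint32_t length;
  uint32_t flags;
} example_string_t;

/* Hypothetical flag: the buffer was allocated for this view
 * and has to be released by the free function. */
#define EXAMPLE_STRING_FLAG_ALLOCATED (1u << 0)

/* Direct access: the view points at the existing data, nothing is copied.
 * Assumes ASCII-only data, so that size and length are equal. */
example_string_t
example_string_borrow (const uint8_t *data_p, uint32_t size)
{
  example_string_t view = { data_p, size, size, 0 };
  return view;
}

/* Fallback: the data cannot be exposed directly, so a copy is allocated.
 * Assumes ASCII-only data, so that size and length are equal. */
example_string_t
example_string_copy (const uint8_t *data_p, uint32_t size)
{
  uint8_t *copy_p = malloc (size > 0 ? size : 1);
  assert (copy_p != NULL);
  memcpy (copy_p, data_p, size);
  example_string_t view = { copy_p, size, size, EXAMPLE_STRING_FLAG_ALLOCATED };
  return view;
}

/* Releases the buffer only if the view owns it. */
void
example_string_free (const example_string_t *string_p)
{
  if (string_p->flags & EXAMPLE_STRING_FLAG_ALLOCATED)
  {
    free ((void *) string_p->buffer_p);
  }
}
```

With a design like this the caller always pairs get with free and never needs to know which of the two paths was taken.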

We would also like to update how the current jerry_string_to_char_buffer, jerry_substring_to_char_buffer, and other variants work. The new function would look like the following:

size_t jerry_string_to_buffer (jerry_value_t string_value,
                               uint8_t *buffer_p,
                               size_t buffer_size,
                               jerry_encoding_t encoding);

Previously, if the buffer was not sufficiently large, jerry_string_to_char_buffer would not copy anything, and would instead return the required buffer size. We would like to change this behaviour to always copy characters, and crop the string data to the external buffer size. The returned value would always be the number of bytes copied.

For substrings we feel a function which behaves similarly to the ECMAScript String.prototype.substring function would be enough to handle all cases:
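
The proposed copy-and-crop behaviour can be sketched on raw bytes as below. example_copy_cropped is a hypothetical stand-in, not an engine function, and it crops at a plain byte boundary; whether the real function would avoid splitting a multi-byte character is not stated in the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed semantics: always copy as many
 * bytes as fit into the buffer and return the number of bytes copied.
 * No terminating zero is written. */
size_t
example_copy_cropped (const uint8_t *src_p,
                      size_t src_size,
                      uint8_t *buffer_p,
                      size_t buffer_size)
{
  assert (src_p != NULL && buffer_p != NULL);
  size_t copied = (src_size < buffer_size) ? src_size : buffer_size;
  memcpy (buffer_p, src_p, copied);
  return copied;
}
```

Copying the 5-byte string "hello" into a 3-byte buffer would store "hel" and return 3, whereas with the old behaviour described above nothing would be copied.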

jerry_value_t jerry_string_substr (jerry_value_t string, uint32_t start_index, uint32_t end_index);

The resulting string can then be accessed/copied to a buffer by using the previously defined functions.
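
For reference, String.prototype.substring clamps both indices to the string length and swaps them if the start is greater than the end. The sketch below shows only that index normalization; example_substring_range is a hypothetical helper, and whether jerry_string_substr would adopt the swapping rule is an assumption, since the proposal only says it behaves "similarly".

```c
#include <assert.h>
#include <stdint.h>

/* Half-open range [start, end) of code units selected from the string. */
typedef struct
{
  uint32_t start;
  uint32_t end;
} example_range_t;

/* Normalizes indices the way String.prototype.substring does for
 * non-negative integers: clamp to the length, then order the pair. */
example_range_t
example_substring_range (uint32_t length, uint32_t start_index, uint32_t end_index)
{
  uint32_t start = (start_index < length) ? start_index : length;
  uint32_t end = (end_index < length) ? end_index : length;

  if (start > end)
  {
    uint32_t tmp = start;
    start = end;
    end = tmp;
  }

  assert (start <= end && end <= length);
  example_range_t range = { start, end };
  return range;
}
```

For a string of length 5, the index pairs (4, 2) and (2, 4) both select the range [2, 4), and (3, 100) selects [3, 5).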

Unhandled features

There are some ECMAScript features that lack native API functions, for example Date objects (#4058). We'd like to go through the API and cover all such cases. Proposals for these will come later.

Source handling

Modules should be handled differently from regular JavaScript sources, and they also need to have a correctly set resource name, otherwise the resolution logic can get confused. This will require additional functions to handle module code. A proposal will come later.

Misc

There are also previous proposals which should be considered: #2510

ossy-szeged commented 3 years ago

Ffg

@akosthekiss Please block this spammer user.

akosthekiss commented 3 years ago

@FelipeF1198 has been blocked and reported. Also removing spammy content from discussion.

0xdddddddd commented 3 years ago

I recommend decoupling jerry_set_object_native_pointer and jerry_get_object_native_pointer from their third parameter, "const jerry_object_native_info_t *native_info_p".

The reason is that C++ code must declare a "jerry_object_native_info_t" when calling these functions, and it must be a static variable. If the free callback were set separately instead, for example with a function like jerry_set_object_native_free_callback(const jerry_value_t object, jerry_object_native_free_callback_t free_cb, void* user_ptr), the API would be more flexible.

In this way, use "jerry_get_object_native_pointer" to get class this and you can get it at will only for the time.

galpeter commented 3 years ago

Thanks for the feedback!

In the current master (not released yet) the jerry_object_native_info_t contains not just a free callback function pointer, but also information on additionally stored jerry_value_t entries (see example: jerry_native_pointer_init_references).

In addition, the static variable (the jerry_object_native_info_t) is also used as a unique "internal identifier" for the given native pointer (user ptr). This makes it possible to have multiple native pointers registered for a single object (with different native info structs).

Is having a static variable with the native information for a given class causing problems for your use case?

> In this way, use "jerry_get_object_native_pointer" to get class this and you can get it at will only for the time.

I'm not sure I understand this sentence correctly. Could you please elaborate?

0xdddddddd commented 3 years ago

I mean the root issue mentioned above.

If the third parameter of jerry_set_object_native_pointer and jerry_get_object_native_pointer is decoupled, C++ code can obtain the this pointer more flexibly.

It would then no longer be necessary to define a static "jerry_object_native_info_t" variable.

Of course it can exist in this.

zherczeg commented 3 years ago

Using a static variable was never a requirement, it just makes your life easier if you need to manage a large number of possible combinations. You can also use NULL as the native info, or allocate it dynamically. The only rule is that it must exist for as long as it is referenced by objects.

0xdddddddd commented 3 years ago

Of course I know that NULL or nullptr can be used.

But in the case of new foo, a free callback is necessary, otherwise foo will become a wild pointer after cleanup.

gjw0813 commented 2 years ago

The new API feels inconvenient to me: I have to create a jerry_object_native_info_t to bind a pointer to an object.

I can easily do this in JerryScript 2.4 and other JS engines.

struct test_data *native_p;
bool has_p = jerry_get_object_native_pointer (receiver, (void *) &native_p, NULL);

dbatyai commented 2 years ago

@gjw0813 Nothing changed that would prevent you from using NULL as the native info. The native pointer functions semantically behave the same way as they did before.

The only thing that changed is the function prototypes, so that they match other property management functions. You only need to call them differently.

struct test_data my_data;
jerry_object_set_native_ptr (object, NULL, &my_data);

void *stored_ptr = jerry_object_get_native_ptr (object, NULL);

gjw0813 commented 2 years ago

Thanks so much! I misunderstood the test case.

https://github.com/jerryscript-project/jerryscript/blob/9860d66a56ed44f62e1dafb9900e6f2c886a56d3/tests/unit-core/test-native-pointer.c#L149