stdlib-js / google-summer-of-code

Google Summer of Code resources.
https://github.com/stdlib-js/stdlib

[Idea]: add support for string arrays in stdlib #44

Open kgryte opened 6 months ago

kgryte commented 6 months ago

Idea

Similar to what's described in https://github.com/stdlib-js/google-summer-of-code/issues/43, a need exists to expand array data type support beyond numeric data types. One such data type is a string data type. The rationale for a dedicated string data type is better interoperation between JavaScript and C. This is particularly important for supporting ndarrays having a string data type, as much of the ndarray iteration machinery is written in C.

Accordingly, the goal of this project is to add a dedicated string typed array called a StringArray, which will support variable-length strings. This new array type should follow a similar path to that of @stdlib/array/complex64, which provides a typed array dedicated to single-precision complex floating-point numbers; namely, StringArray should support standard typed array methods, as well as provide accessors for getting and setting array elements.

Note, however, that a StringArray should be a typed array. A StringArray should not wrap a "generic" array. Instead, the array should be backed by fixed-length memory, similar to how @stdlib/array/complex64 is backed by a Float32Array. One possibility is backing StringArray instances with Node.js Buffer objects, which are, in turn, Uint8Arrays.
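To make the "backed by fixed-length memory" idea concrete, here is a minimal sketch of one possible layout: all strings are UTF-8 encoded into a single Uint8Array, and a second Uint32Array records where each element starts. The class name `PackedStringArray`, its private fields, and the choice of UTF-8 are assumptions made for illustration only; none of this is an existing stdlib API or a settled design.

```javascript
// Hypothetical sketch (not a stdlib API): a read-only string array whose
// elements live in one contiguous byte buffer plus an offsets table.
const encoder = new TextEncoder(); // encodes strings as UTF-8 bytes
const decoder = new TextDecoder(); // decodes UTF-8 bytes back to strings

class PackedStringArray {
	constructor( strings ) {
		// Encode every element up front so the total byte length is known:
		const chunks = strings.map( ( s ) => encoder.encode( s ) );
		const total = chunks.reduce( ( n, c ) => n + c.length, 0 );

		this.length = strings.length;

		// offsets[ i ] is the byte index at which element `i` starts;
		// offsets[ length ] marks the end of the last element:
		this._offsets = new Uint32Array( strings.length + 1 );
		this._bytes = new Uint8Array( total );

		let pos = 0;
		for ( let i = 0; i < chunks.length; i++ ) {
			this._offsets[ i ] = pos;
			this._bytes.set( chunks[ i ], pos );
			pos += chunks[ i ].length;
		}
		this._offsets[ chunks.length ] = pos;
	}

	// Returns the element at index `i`, or `undefined` when out of bounds.
	get( i ) {
		if ( !Number.isInteger( i ) || i < 0 || i >= this.length ) {
			return undefined;
		}
		const start = this._offsets[ i ];
		const end = this._offsets[ i + 1 ];
		return decoder.decode( this._bytes.subarray( start, end ) );
	}
}
```

With this layout, both buffers are plain typed arrays, so in principle they could be handed to C code without copying. The layout does not by itself answer how elements should be updated in place.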

There are, however, some design considerations; namely, how to handle setting of array elements. In particular, what happens when a user attempts to update a StringArray element with a larger string? Does that lead to a new memory allocation and data copy? Or should elements have a fixed allocation to allow for elements to grow until some maximum size?
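One of the options raised above, a fixed allocation per element, can be sketched as follows. Each element gets a slot of `maxBytes` bytes, so an update never reallocates, but a string that does not fit is rejected. The class name `FixedSlotStringArray`, the `maxBytes` parameter, the `set( value, i )` argument order, and the decision to throw a RangeError are all assumptions of this sketch rather than proposed behavior.

```javascript
// Hypothetical sketch (not a stdlib API): every element owns a fixed-size
// slot in a single Uint8Array, so updates never trigger a reallocation.
const slotEncoder = new TextEncoder();
const slotDecoder = new TextDecoder();

class FixedSlotStringArray {
	constructor( length, maxBytes ) {
		this.length = length;
		this._maxBytes = maxBytes;

		// Number of bytes currently used in each slot:
		this._lengths = new Uint32Array( length );

		// One contiguous allocation of `length` slots:
		this._bytes = new Uint8Array( length * maxBytes );
	}

	// Writes `value` into slot `i`; throws if it exceeds the slot size.
	set( value, i ) {
		if ( !Number.isInteger( i ) || i < 0 || i >= this.length ) {
			throw new RangeError( 'index out of bounds' );
		}
		const bytes = slotEncoder.encode( value );
		if ( bytes.length > this._maxBytes ) {
			throw new RangeError( 'string exceeds the fixed slot size' );
		}
		this._bytes.set( bytes, i * this._maxBytes );
		this._lengths[ i ] = bytes.length;
	}

	// Returns the element at index `i`, or `undefined` when out of bounds.
	get( i ) {
		if ( !Number.isInteger( i ) || i < 0 || i >= this.length ) {
			return undefined;
		}
		const start = i * this._maxBytes;
		return slotDecoder.decode( this._bytes.subarray( start, start + this._lengths[ i ] ) );
	}
}
```

The trade-off is visible in the constructor: memory use is `length * maxBytes` regardless of the actual contents, whereas a tightly packed layout saves space but would need to shift or reallocate data whenever an element grows.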

As part of this project, not only will a new StringArray be added to the project, but it will be integrated throughout stdlib. This will entail adding support for StringArrays wherever arrays are accepted/used, following the same precedent established by @stdlib/array/complex64 and other custom array types in stdlib. This includes adding support for string arrays in ndarray APIs.

Prior Art

Expected outcomes

The expected outcomes of this idea should be (1) creation of a new @stdlib/array/string package exposing a new typed array constructor, (2) support for StringArray instances throughout @stdlib/array/*, (3) support for StringArray instances as backing arrays for ndarrays (which may involve working with various C APIs), and (4) any other integration opportunities.

Status

While no work has been done to create a new @stdlib/array/string package, there exists prior art for adding custom typed arrays to stdlib; namely, Complex64Array and Complex128Array.

Involved software

No special software is needed for the initial work. Once work has progressed to ndarray support, access to a C compiler will be needed, as documented in the project development guide.

Technology

JavaScript, C, Node.js, native addons

Other technology

n/a

Difficulty

Intermediate/Advanced

Difficulty justification

This project is ambitious, as there are many design considerations which need to be addressed in order to ensure performance and allow for efficient JS/C interoperation.

Additionally, beyond creating the new StringArray class, it will be difficult to find all the code throughout the project that needs to be updated so that StringArray instances are supported throughout stdlib on equal footing with other array data types.

Prerequisite knowledge

Familiarity and comfort with JavaScript are highly recommended, given that this project will require considerable programming in JavaScript. Some familiarity with C would also be good, especially for string array integration with ndarrays.

Project length

350 hours, as the project will likely involve a decent amount of R&D.

Potential mentors

@kgryte @Planeshifter

YASHSHAH-create commented 2 months ago

I want to resolve this issue.

YASHSHAH-create commented 2 months ago

Please assign me this issue.