Open josevalim opened 2 years ago
I see two problems with the API:
Nr 1 is more about API consistency than a practical problem I think.
What do you think about a callback driven API, something like this:
{
int counter = 1;
return enif_generate_list(env, 10, callback, &counter);
}
ERL_NIF_TERM callback(ErlNifEnv* env, void* arg)
{
int* counter_ptr = (int*)arg;
return enif_make_int(env, (*counter_ptr)++);
}
This would make a lists:seq(1,10).
It's a compact API, just one function enif_generate_list
.
It could also be expanded to let the callback function decide when to terminate the list.
The drawback is the list terms must be generated sequentially from first to last. No random access like your enif_set_list
.
API wise that looks great to me, although I am unsure if this is something we could leverage from Rust. @hansihe, do you have thoughts here?
The drawback is the list terms must be generated sequentially from first to last.
If that ever becomes a need, we could also have enif_generate_reverse_list
or something.
I think an API like that would work very well with Rust iterators.
Allowing the callback to decide when the list terminates is a good idea as well, as it would enable us to build lists from iterators of unknown length. I don't know the internals on the OTP side, but if it could be beneficial, allowing the caller to specify an optional list size hint would also be an option.
Proposal:
ERL_NIF_TERM enif_generate_list(ErlNifEnv*,
ErlNifGenerateListFn* callback,
size_t length_hint);
typedef ERL_NIF_TERM ErlNifGenerateListFn(ErlNifEnv*,
ERL_NIF_TERM* end_tail,
void* arg);
To terminate the list, the callback would do
*end_tail = enif_make_list(env,0);
or set something else to append on another list or even make an improper list.
It should probably be named something else as all other term creating functions are named enif_make_*
. The other list creating functions are called enif_make_list*, so maybe
enif_make_list_generate
enif_make_list_while
A revised API with simple implementation proposal. In short, in each callback
*head
to create a new list cell.*tail
to end the list with whatever tail you want.
typedef void ErlNifMakeListWhileFn(ErlNifEnv*,
ERL_NIF_TERM* head,
ERL_NIF_TERM* tail,
void* arg);
ERL_NIF_TERM enif_make_list_while(ErlNifEnv env, ErlNifMakeListWhileFn callback, void arg, size_t length_hint) { Eterm result; Eterm prev_tailp = &result; Eterm head, tail; do { head = tail = THE_NON_VALUE; callback(env, &head, &tail, arg);
if (head != THE_NON_VALUE) {
Eterm* hp = alloc_heap(env, 2);
*prev_tailp = make_list(hp);
CAR(hp) = head;
prev_tailp = &CDR(hp);
}
} while (tail == THE_NON_VALUE);
*prev_tailp = tail;
return result;
}
Features
* Create list from head to tail without prior knowledge of list length
* Create improper lists
* Append on existing list
* Create empty lists
* Option to end the list either at creation of last cell or at next call after last cell, depending on what fit users iteration best.
* Bonus "feature": Make any term you want (by writing only to `*tail` at first call).
I thought about adding some way to abort and signal error back to caller, but the user can do that themself with `arg`.
LGTM and i think it would also work with Rust iterators, right @hansihe?
Yep, this looks like a really nice API that would fit in with Rust iterators very well.
When working with NIFs and you have to allocate large lists, we typically have two options:
enif_make_list_from_array
but that requires allocating the data into an array, which will then be discardedenif_make_list_cell
but that requires traversing from the back and allocating the cells one at a timeFor large lists (32 elements over), 1 ends up being faster because the whole list is allocated at once but it uses more memory. 2 uses less memory but ends up slower (at least on Apple M1).
It would be nice if we had a similar API to binaries, where the list is preallocated once and then we can set the list elements one by one:
The above will allocate a list of 10 elements, where each element points to an empty list (the third argument). Now, we can update the given list element with:
Eventually,
enif_make_alloc_list
orenif_release_alloc_list
needs to be invoked, similar to the binary API. The hope is that this API will put the performance closer toenif_make_list_from_array
(the cost is one extra pass setting the CARs) while avoiding allocating an intermediate array.I am not sure if this proposal makes sense, therefore feedback is welcome. :)
Is your feature request related to a problem? Please describe.
This came up here: https://github.com/rusterlium/rustler/pull/486