weld-project / weld

High-performance runtime for data analytics applications
https://www.weld.rs
BSD 3-Clause "New" or "Revised" License

unique() function on weld-capi. #524

Open kchasialis opened 2 years ago

kchasialis commented 2 years ago

Hello,

I want to implement the grizzly_impl.unique() function using weld-capi.

After looking at grizzly_impl.py, I found that this is the Weld code for unique():

map(
  tovec(
    result(
      for(
        map(
          obj_id,
          |p: vec[i8]| {p,0}
        ),
        dictmerger[vec[i8],i32,+],
        |b, i, e| merge(b,e)
      )
    )
  ),
  |p: {vec[i8], i32}| p.$0
)

However, obj_id is only determined at runtime, and I do not know how to do that using weld-capi. Basically, my question is how to write a unique() function in Weld IR that can be compiled and called through weld-capi.

Thanks in advance!
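
For illustration, here is a minimal sketch of one way the runtime obj_id could be replaced by an explicit program argument, so that the whole expression becomes a self-contained Weld function. It assumes the snippet above is a string template in which obj_id and the element type are substituted, and that a complete Weld program is a lambda of the form |name: type| body. The names UNIQUE_TEMPLATE, unique_program and data are hypothetical and not part of Weld or Grizzly. The sketch only builds the IR text; the resulting string would then presumably be passed to the C API's compile call (weld_module_compile in weld.h, if I read the header correctly), which is not shown here.

```python
# Hypothetical template: {array} takes the place of obj_id, {ty} is the
# element type. Doubled braces are literal braces in the Weld struct syntax.
UNIQUE_TEMPLATE = """\
|{array}: vec[{ty}]|
  map(
    tovec(
      result(
        for(
          map({array}, |p: {ty}| {{p, 0}}),
          dictmerger[{ty},i32,+],
          |b, i, e| merge(b, e)
        )
      )
    ),
    |p: {{{ty}, i32}}| p.$0
  )
"""


def unique_program(array="data", ty="vec[i8]"):
    """Return Weld IR text for unique() with the input bound as a lambda argument."""
    return UNIQUE_TEMPLATE.format(array=array, ty=ty)


if __name__ == "__main__":
    # Print the generated program so it can be pasted into a C string literal.
    print(unique_program())
```

With the defaults, the generated program starts with the header |data: vec[vec[i8]]|, so the vector of strings would be supplied as the single argument when the compiled module is run, rather than being referenced through a runtime-generated name.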