weld-project / weld

High-performance runtime for data analytics applications
https://www.weld.rs
BSD 3-Clause "New" or "Revised" License

Add element access and masking support to Series #502

Closed: sppalkia closed 4 years ago

sppalkia commented 4 years ago

Adds support for masking and indexing operations on GrizzlySeries. These operations differ slightly from their Pandas counterparts: they do not use indexes. Instead, they behave more like operations on NumPy arrays, which allows more efficient execution, and they forgo index alignment along the vertical axis.

This behavior may change in the future if we choose to add indexes.

Examples

        >>> x = GrizzlySeries([1,2,3])
        >>> x[1]
        2
        >>> y = x + x
        >>> y[1] # Causes evaluation
        4
        >>> y = x + x
        >>> z = y[0:2]
        >>> z.evaluate()
        0    2
        1    4
        dtype: int64
        >>> y = x + x
        >>> z = y[:2]
        >>> z.evaluate()
        0    2
        1    4
        dtype: int64
        >>> x = GrizzlySeries([1,2,3,4,5])
        >>> y = x[GrizzlySeries([True, False, False, False, False])]
        >>> y.evaluate()
        0    1
        dtype: int64
        >>> y = x[x % 2 == 0]
        >>> y.evaluate()
        0    2
        1    4
        dtype: int64
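
To make the index-free semantics concrete, here is a minimal pure-Python sketch of positional access, slicing, and boolean masking. `PositionalSeries` is a hypothetical stand-in written only for illustration; it is not the Grizzly API and does not reflect how GrizzlySeries is implemented (in particular, it has no lazy evaluation). It only shows that a masked result is renumbered from position 0, as in the examples above, whereas Pandas would keep the original labels (1 and 3) for the same filter.

```python
from typing import List, Sequence, Union


class PositionalSeries:
    """Hypothetical index-free series; an illustration, not the Grizzly API."""

    def __init__(self, values: Sequence) -> None:
        self.values: List = list(values)

    def __getitem__(self, key: Union[int, slice, "PositionalSeries"]):
        if isinstance(key, PositionalSeries):
            # Boolean mask: keep the elements whose mask entry is truthy.
            if len(key.values) != len(self.values):
                raise ValueError("mask length must match series length")
            kept = [v for v, keep in zip(self.values, key.values) if keep]
            return PositionalSeries(kept)
        if isinstance(key, slice):
            # Slice: positional range, returned as a new series.
            return PositionalSeries(self.values[key])
        # Integer: plain positional element access.
        return self.values[key]

    def __mod__(self, other: int) -> "PositionalSeries":
        # Element-wise modulo against a scalar.
        return PositionalSeries([v % other for v in self.values])

    def __eq__(self, other) -> "PositionalSeries":
        # Element-wise comparison against a scalar, producing a boolean mask.
        return PositionalSeries([v == other for v in self.values])


x = PositionalSeries([1, 2, 3, 4, 5])
evens = x[x % 2 == 0]

print(x[1])            # 2
print(x[0:2].values)   # [1, 2]
print(evens.values)    # [2, 4], now at positions 0 and 1
```

Because no labels are carried along, two filtered results of equal length can be combined element by element without any alignment step, which is the trade-off the description above refers to.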