enso-org / dataframes

A library for working with tabular data in Luna.
https://luna-lang.org
MIT License

Ungroup splitting #134

Closed mwu-tow closed 5 years ago

mwu-tow commented 5 years ago

A utility needed for an upcoming demo. The function Table.ungroupSplittingOn ungroups rows by splitting the strings in a given column on a separator, producing one output row per resulting piece.

E.g.

import Std.Base
import Std.Foreign.C.Value

import Dataframes.Column
import Dataframes.Types
import Dataframes.Table

def renameAt self index name:
    col = Table.columnAt index
    col2 = col.rename name

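def main:
    # Load the input table from a CSV file.
    t = Table.read "C:/Users/mwu/Downloads/games.csv"
    # Split the "tag" column on single spaces, one output row per piece.
    t2 = t.ungroupSplittingOn "tag" " "
    # Print the original and the ungrouped table for comparison.
    print t
    print t2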

Basically, the purpose is to transform a table like:

col1    col2
a       foo bar
b       foo baz
c       fff

into:

col1    col2
a       foo
a       bar
b       foo
b       baz
c       fff

It allows ungrouping rows by splitting on any string separator in a given string column.
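To make the intended semantics concrete, here is a minimal Python model of the transformation shown in the tables above. It is only a sketch, not the library's implementation: the name ungroup_splitting_on and the list-of-dicts table representation are hypothetical, and how the real function treats empty or null cells is an assumption left open here.

```python
def ungroup_splitting_on(rows, column, separator):
    """Split `column` of every row on `separator`, emitting one row per piece.

    `rows` is a list of dicts (a stand-in for a table); all other columns
    are copied unchanged into every emitted row.
    """
    result = []
    for row in rows:
        for piece in row[column].split(separator):
            new_row = dict(row)      # copy so the input rows are not mutated
            new_row[column] = piece  # replace the grouped value with one piece
            result.append(new_row)
    return result


# The example table from the description.
table = [
    {"col1": "a", "col2": "foo bar"},
    {"col1": "b", "col2": "foo baz"},
    {"col1": "c", "col2": "fff"},
]

ungrouped = ungroup_splitting_on(table, "col2", " ")
for row in ungrouped:
    print(row["col1"], row["col2"])
```

In this model a row whose value does not contain the separator (such as "fff") passes through unchanged, and the relative order of the input rows is preserved.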

In the longer term, this use case should be addressed more generally by using Arrow's lists; for now, this is a reasonable addition.