Open eric-liuyd opened 3 years ago
Hi, I'm sorry you're not seeing the functionality you expect. I agree that a parse function that splits string columns should be there and wouldn't be hard to write.
If I understand the problem, I think you can get the results you're looking for writing a simple loop:
void parse(Table df, StringColumn sc) {
DoubleColumn a = DoubleColumn.create("a", df.rowCount());
DoubleColumn b = DoubleColumn.create("b", df.rowCount());
df.addColumns(a, b);
for (Row row: df) {
String[] rats = row.getString("multi-ratios").split(",");
row.setDouble("a", Double.parseDouble(rats[0]));
row.setDouble("b", Double.parseDouble(rats[1]));
}
DoubleColumn amount = df.doubleColumn("amount");
df.addColumns(amount.multiply(a), amount.multiply(b));
}
Hi, I'm sorry you're not seeing the functionality you expect. I agree that a parse function that splits string columns should be there and wouldn't be hard to write.
If I understand the problem, I think you can get the results you're looking for writing a simple loop:
void parse(Table df, StringColumn sc) { DoubleColumn a = DoubleColumn.create("a", df.rowCount()); DoubleColumn b = DoubleColumn.create("b", df.rowCount()); df.addColumns(a, b); for (Row row: df) { String[] rats = row.getString("multi-ratios").split(","); row.setDouble("a", Double.parseDouble(rats[0])); row.setDouble("b", Double.parseDouble(rats[1])); } DoubleColumn amount = df.doubleColumn("amount"); df.addColumns(amount.multiply(a), amount.multiply(b)); }
Thanks lwhite1. But there seems to be some misunderstanding I think. The parse method is not the core request. The most important thing I consider is how to expand one row to multiple rows. For example, expand String "0.6,0.3,0.1" in one row to multiple rows, each with value "0.6", "0.3", "0.1".
Hi @eric-liuyd,
For this you can write a slightly different (double) loop:
Table df = Table.create("orig", IntColumn.create("amount"), StringColumn.create("multi_ratio"));
df.intColumn("amount").append(100).append(200);
df.stringColumn("multi_ratio").append("0.8,0.2").append("0.5,0.5");
Table df2 = Table.create("result", IntColumn.create("amount"), DoubleColumn.create("multi_ratio_single"));
for(Row row : df) {
for(String s : row.getString("multi_ratio").split(",")) {
df2.intColumn("amount").append(row.getInt("amount"));
df2.doubleColumn("multi_ratio_single").append(Double.parseDouble(s));
}
}
df2.addColumns(df2.doubleColumn("multi_ratio_single").multiply(df2.intColumn("amount")).setName("multiply_result"));
Note that this code is not production grade ...
You probably can also implement it by extracting the columns as in @lwhite1 example and do some pivoting. Not sure which is better.
Hi @eric-liuyd https://github.com/eric-liuyd,
I would look at the Table:melt() method, which implements the "tidy" melt operation, but I don't have time to confirm and write it up.
OTOH, if I were doing it myself, I might do something like what @ccleva suggests: basically copying into a new table in a loop.
Hope one of these works.
larry
On Mon, Sep 13, 2021 at 6:13 AM ccleva @.***> wrote:
Hi @eric-liuyd https://github.com/eric-liuyd,
For this you can write a slightly different (double) loop:
Table df = Table.create("orig", IntColumn.create("amount"), StringColumn.create("multi_ratio")); df.intColumn("amount").append(100).append(200); df.stringColumn("multi_ratio").append("0.8,0.2").append("0.5,0.5"); Table df2 = Table.create("result", IntColumn.create("amount"), DoubleColumn.create("multi_ratio_single")); for(Row row : df) { for(String s : row.getString("multi_ratio").split(",")) { df2.intColumn("amount").append(row.getInt("amount")); df2.doubleColumn("multi_ratio_single").append(Double.parseDouble(s)); } } df2.addColumns(df2.doubleColumn("multi_ratio_single").multiply(df2.intColumn("amount")).setName("multiply_result"));
Note that this code is not production grade ...
You probably can also implement it by extracting the columns as in @lwhite1 https://github.com/lwhite1 example and do some pivoting. Not sure which is better.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jtablesaw/tablesaw/issues/986#issuecomment-918042582, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2FPAUFQH33W73MAQHBVHTUBXFFTANCNFSM5DBABISA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Hi @eric-liuyd https://github.com/eric-liuyd, I would look at the Table:melt() method, which implements the "tidy" melt operation, but I don't have time to confirm and write it up. OTOH, if I were doing it myself, I might do something like what @ccleva suggests: basically copying into a new table in a loop. Hope one of these works. larry … On Mon, Sep 13, 2021 at 6:13 AM ccleva @.***> wrote: Hi @eric-liuyd https://github.com/eric-liuyd, For this you can write a slightly different (double) loop: Table df = Table.create("orig", IntColumn.create("amount"), StringColumn.create("multi_ratio")); df.intColumn("amount").append(100).append(200); df.stringColumn("multi_ratio").append("0.8,0.2").append("0.5,0.5"); Table df2 = Table.create("result", IntColumn.create("amount"), DoubleColumn.create("multi_ratio_single")); for(Row row : df) { for(String s : row.getString("multi_ratio").split(",")) { df2.intColumn("amount").append(row.getInt("amount")); df2.doubleColumn("multi_ratio_single").append(Double.parseDouble(s)); } } df2.addColumns(df2.doubleColumn("multi_ratio_single").multiply(df2.intColumn("amount")).setName("multiply_result")); Note that this code is not production grade ... You probably can also implement it by extracting the columns as in @lwhite1 https://github.com/lwhite1 example and do some pivoting. Not sure which is better. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#986 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2FPAUFQH33W73MAQHBVHTUBXFFTANCNFSM5DBABISA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Ok. I strongly suggest you do it when you are available. Btw, could the method be implemented like Pandas DataFrame:apply() ? The apply() method can both operate on row axis and column axis.
Hi @eric-liuyd,
For this you can write a slightly different (double) loop:
Table df = Table.create("orig", IntColumn.create("amount"), StringColumn.create("multi_ratio")); df.intColumn("amount").append(100).append(200); df.stringColumn("multi_ratio").append("0.8,0.2").append("0.5,0.5"); Table df2 = Table.create("result", IntColumn.create("amount"), DoubleColumn.create("multi_ratio_single")); for(Row row : df) { for(String s : row.getString("multi_ratio").split(",")) { df2.intColumn("amount").append(row.getInt("amount")); df2.doubleColumn("multi_ratio_single").append(Double.parseDouble(s)); } } df2.addColumns(df2.doubleColumn("multi_ratio_single").multiply(df2.intColumn("amount")).setName("multiply_result"));
Note that this code is not production grade ...
You probably can also implement it by extracting the columns as in @lwhite1 example and do some pivoting. Not sure which is better.
Hi ccleva. Thanks for your code. It really works. I didn't get the use of append() before.
One of the most needed functions in data analysis is the operation among multiple columns and generation of new columns (and rows). The need can be abstracted to a unified method like
List<Column> columnOperate(List<Column>)
to accomplish inter-column operation tasks. But now, I encountered an inter-column operation problem which cannot be solved efficiently and elegantly using only a few methods. In fact I found I couldn't solve this basic need using methods given in tablesaw. Details is provided below.Let's say I have a Table named "df" with two columns "multi_ratio" and "amount".
Now I need to 1) split every value in col "multi_ratio" into multiple values (e.g., convert "0.8,0.2" to List of 0.8, 0.2) , 2) amount multi_ratio (split) (e.g., 100 List of 0.8, 0.2 -> List of 80, 20) , 3) result of 2. expanded to multiple rows.
So the final result I need would be.
To achieve this goal, I firstly make an empty copy of
df
and add empty columns.Then, I tried to operate on each row of
df
, to generate new rows and add them todf2
, but it seems just not work.Does anybody have suggests to make this happen efficiently?