Basically, grouping returns wrong sums when I use the stream returned by the Table stream method.
(I know I can group by a column in other ways; this is just an example to demonstrate a bug in the Tablesaw Stream API.)
Tablesaw version: 0.43.1
Requires JDK 11
Here is a test that demonstrates the issue:
import lombok.AllArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import tech.tablesaw.api.IntColumn;
import tech.tablesaw.api.Row;
import tech.tablesaw.api.StringColumn;
import tech.tablesaw.api.Table;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

@Slf4j
class TableSawGroupTest {

    static List<Holder> testData = List.of(
            new Holder("a", 1),
            new Holder("b", 2),
            new Holder("c", 3),
            new Holder("a", -1)
    );

    @Test
    void shouldGroupBy() {
        Map<String, Integer> tableSawRes = tableSawVersion();
        Map<String, Integer> javaRes = javaStreamVersion();
        Assertions.assertEquals(javaRes, tableSawRes);
    }

    Map<String, Integer> tableSawVersion() {
        StringColumn strColumn = StringColumn.create("A-str", testData.stream().map(p -> p.strValue).collect(Collectors.toList()));
        IntColumn intColumn = IntColumn.create("B-int", testData.stream().map(p -> p.intValue).toArray(Integer[]::new));
        Table table = Table.create(strColumn, intColumn);
        log.info("Table: {}", table.printAll());
        return table.stream()
                .collect(Collectors.groupingBy(p -> p.getString("A-str"), LinkedHashMap::new,
                        Collectors.collectingAndThen(Collectors.toList(), rows -> {
                            int sum = 0;
                            for (Row row : rows) {
                                int bValue = row.getInt("B-int");
                                sum = sum + bValue;
                            }
                            return sum;
                        })));
    }

    Map<String, Integer> javaStreamVersion() {
        return testData.stream()
                .collect(Collectors.groupingBy(p -> p.strValue, LinkedHashMap::new,
                        Collectors.collectingAndThen(Collectors.toList(), rows -> {
                            int sum = 0;
                            for (Holder row : rows) {
                                int bValue = row.intValue;
                                sum = sum + bValue;
                            }
                            return sum;
                        })));
    }

    @AllArgsConstructor
    private static class Holder {
        String strValue;
        int intValue;
    }
}
The test fails; here is the output:
2023-06-17 08:50:09 INFO TableSawGroupTest:35 - Table: A-str | B-int |
-------------------
a | 1 |
b | 2 |
c | 3 |
a | -1 |
org.opentest4j.AssertionFailedError:
Expected :{a=0, b=2, c=3}
Actual :{a=-2, b=-1, c=-1}
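One possible reading of the Actual values, which is my assumption and not something I have confirmed in the Tablesaw source: every group total equals (number of rows in the group) × (-1), where -1 is the B-int value of the last row. That is the pattern I would expect if the stream handed out the same mutable Row object for every element, so that all Rows collected into the lists end up pointing at the last row by the time the sums are computed. The sketch below reproduces the pattern in plain Java without Tablesaw. `CursorReuseDemo`, `Cursor`, and `cursorStream` are hypothetical names made up for this illustration and are not Tablesaw APIs.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class CursorReuseDemo {
    // Same data as the test above.
    static final List<String> KEYS = List.of("a", "b", "c", "a");
    static final List<Integer> VALUES = List.of(1, 2, 3, -1);

    /** Hypothetical stand-in for a row: one mutable object pointing at the "current" row. */
    static class Cursor {
        int index = -1;
        String key() { return KEYS.get(index); }
        int value() { return VALUES.get(index); }
    }

    /** Emits the SAME Cursor instance for every element, advancing it on each step. */
    static Stream<Cursor> cursorStream() {
        Cursor shared = new Cursor();
        Iterator<Cursor> it = new Iterator<>() {
            @Override public boolean hasNext() { return shared.index < KEYS.size() - 1; }
            @Override public Cursor next() { shared.index++; return shared; }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    /** Collects cursors into lists first and reads the values only after the stream is exhausted. */
    static Map<String, Integer> sumAfterCollecting() {
        return cursorStream().collect(Collectors.groupingBy(Cursor::key, LinkedHashMap::new,
                Collectors.collectingAndThen(Collectors.toList(),
                        rows -> rows.stream().mapToInt(Cursor::value).sum())));
    }

    /** Reads each value while the cursor still points at the row being processed. */
    static Map<String, Integer> sumWhileStreaming() {
        return cursorStream().collect(Collectors.groupingBy(Cursor::key, LinkedHashMap::new,
                Collectors.summingInt(Cursor::value)));
    }

    public static void main(String[] args) {
        System.out.println(sumAfterCollecting()); // prints {a=-2, b=-1, c=-1}
        System.out.println(sumWhileStreaming());  // prints {a=0, b=2, c=3}
    }
}
```

If that assumption holds for Table#stream, the group keys come out right because the classifier runs while the cursor is on the correct row, while the sums come out wrong because they are read afterwards. Reading the value inside the stream (for example with Collectors.summingInt) would sidestep the problem, but it does not change the fact that the current behavior is surprising for a Stream of rows.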
I ran into this issue when applying a group-by operation on a specific column using the tech.tablesaw.api.Table#stream API https://javadoc.io/doc/tech.tablesaw/tablesaw-core/latest/tech/tablesaw/api/Table.html#stream-- (Returns the rows in table as a Stream).
See tablesaw-test.zip