Closed WojtekPtak closed 3 years ago
The parent_id part works with:

    // for path [1,3,6] it will return 3 - last but one element from the list
    fun decodeParentId(it: ExpressionContext): List<Any?> {
        return it["path"].map

and

    val testDF1 = treeDF.addColumn("parent_id") {
        decodeParentId(it)
    }
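The parsing step itself does not depend on krangl, so it can be sketched in plain Kotlin. This is only a minimal sketch, assuming the path is stored as a string such as "[1,3,6]"; the helper name `parentIdOf` is hypothetical and not part of krangl.

```kotlin
// Hypothetical helper (not a krangl API): extracts the direct parent id
// from a path string such as "[1,3,6]".
// Returns null for a top-level node, whose path holds only its own id.
fun parentIdOf(path: String): Int? {
    val ids = path
        .trim()
        .removeSurrounding("[", "]")             // "[1,3,6]" -> "1,3,6"
        .split(",")
        .mapNotNull { it.trim().toIntOrNull() }  // skip blanks and non-numeric parts
    // The parent is the last-but-one element; getOrNull covers paths that are too short.
    return ids.getOrNull(ids.size - 2)
}
```

Inside `addColumn`, a helper like this could presumably be applied to each value of the "path" column, assuming krangl's `map<String> { ... }` column extension is used for the per-row mapping.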
I'm closing this - I found a solution - but I'm going to add some examples to the repo :)
Sorry for being so slow here. Great that you could work out a solution yourself. Feel welcome to suggest doc-additions or to raise further questions.
Hello! I'm not sure if this place is only for ideas/bugs but I have a question...
I'm wondering if it's possible to decode data like this, which I used recently in pandas. It's a simple tree I'm using to learn krangl, but it's a format which I will use from CSV files:
    A(1)
      AA(2)
        AAA(5)
      AB(3)
        ABA(6)
        ABB(7)
      AC(4)
How can I get the direct parent (the last-but-one element from "path", but null for the top parent)? In Python I was able to use a lambda with the row value and then something like row['path'][-2] to get e.g. 3 for the ABA node. I can see there is only String for such purposes, so probably I should take a substring of "path" to keep only the integers, then map it to integers and then select the element at size-2.
It's possible for sure, but I wasn't able to do it. But let's say that I have the new column "parent_id", as below:

    "A", 1, "[1]", null
    "AA", 2, "[1,2]", 1
    "AB", 3, "[1,3]", 1
    "AC", 4, "[1,4]", 1
    "AAA", 5, "[1,2,5]", 2
    "ABA", 6, "[1,3,6]", 3
    "ABB", 7, "[1,3,7]", 3
But now I need a "parent_name" column. Again, in Python I did it using something like:

    def find_node_by_id(df, node_id):
        row = df.loc[df['node_id'] == node_id]
        return row

to find the row with the parent definition, and then I was able to obtain node_name from that row. But it's a bit hard for me - maybe because I'm not very familiar with Kotlin :)
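For illustration, the same lookup can be sketched in plain Kotlin by building an id-to-name index once instead of searching the whole table for every row. The `Node` class and the `parentNames` function are hypothetical names for this sketch, not krangl types, and the rows are assumed to be already loaded into a list.

```kotlin
// Hypothetical row type mirroring the node_id, node_name and parent_id columns.
data class Node(val nodeId: Int, val nodeName: String, val parentId: Int?)

// Builds the id -> name index once, then resolves each node's parent name.
// Top-level nodes (parentId == null) and unknown parent ids map to null.
fun parentNames(nodes: List<Node>): Map<Int, String?> {
    val nameById = nodes.associate { it.nodeId to it.nodeName }
    return nodes.associate { node ->
        node.nodeId to node.parentId?.let { nameById[it] }
    }
}
```

The index makes each lookup a constant-time map access, which is the plain-collections counterpart of joining the table with itself on parent_id = node_id.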
Any advice on which methods I should use to work with vectors/arrays? What about grouping, e.g. such a tree by depth:

    "depth", "nodes"
    1, [1]
    2, [2,3,4]
    3, [5,6,7]

I would expect a vector as the "nodes" value. And what about searching for a row in the dataframe and using the result to get parent_name? I guess I can do it e.g. by filtering, but I'm not sure it's the optimal way for big data.
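The grouping by depth can be sketched with the Kotlin standard library alone. This is a minimal sketch that assumes the paths are already parsed into lists of ids and that depth means the number of ids in a path; `nodesByDepth` is a hypothetical name, not a krangl function.

```kotlin
// Groups node ids by depth, where depth is the number of ids in the node's path.
// Input: node id -> path (root first, the node itself last).
fun nodesByDepth(paths: Map<Int, List<Int>>): Map<Int, List<Int>> =
    paths.entries.groupBy(
        { it.value.size },  // key: depth of the node
        { it.key }          // value: the node id
    )
```

Each depth maps to a list of node ids, which matches the "vector as nodes value" shape described above.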
I was searching for a DataFrame solution for Java. Krangl looks interesting and similar to pandas, but I'm not sure if I shouldn't use Spark DataFrame instead :/