Closed cmelchior closed 1 month ago
There is a problem when a FrameColumn contains frames with different schemas. I recommend attaching types to the metadata of each nested frame. This may lead to duplication if the schema of each nested frame is the same, but it will make it easier to work with on the Kotlin Notebook plugin side. We already have a lot of duplication because we pass column names for each value in rows, so this additional overhead will be minimal.
Here is a short reproducer of the problem:
dataFrameOf("a", "b")(1, dataFrameOf("c", "d")(1, 2), 2, dataFrameOf("e", "f")(1, 2))
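To make the issue concrete, here is a minimal sketch in plain Kotlin with no DataFrame dependency. The names NestedFrame, FrameMetadata and metadataOf are hypothetical and only for illustration; they are not part of the Kotlin DataFrame API or of the actual serialization format. The sketch mirrors the reproducer above: column "b" holds two nested frames whose column names differ, so one schema per column cannot describe both, while metadata attached to each nested frame can.

```kotlin
// Hypothetical stand-in for a nested data frame: column name -> values.
// This is NOT the Kotlin DataFrame API, only an illustration.
data class NestedFrame(val columns: Map<String, List<Any?>>)

// Hypothetical per-frame metadata: column names plus a simple type name per column.
data class FrameMetadata(val columns: List<String>, val types: List<String>)

// Derive metadata from one nested frame, using the runtime class of the first value.
fun metadataOf(frame: NestedFrame): FrameMetadata {
    val names = frame.columns.keys.toList()
    val types = names.map { name ->
        frame.columns.getValue(name).firstOrNull()?.let { it::class.simpleName } ?: "Nothing?"
    }
    return FrameMetadata(names, types)
}

fun main() {
    // Mirrors the reproducer: column "b" holds two frames with different schemas.
    val first = NestedFrame(mapOf("c" to listOf(1), "d" to listOf(2)))
    val second = NestedFrame(mapOf("e" to listOf(1), "f" to listOf(2)))
    val frameColumn = listOf(first, second)

    // A single column-level schema cannot describe both rows...
    val distinctSchemas = frameColumn.map { it.columns.keys }.distinct()
    println("Distinct schemas in column b: ${distinctSchemas.size}")

    // ...whereas metadata attached to each nested frame can.
    frameColumn.map(::metadataOf).forEach(::println)
}
```

The cost is the duplication mentioned above: when every nested frame shares one schema, the same metadata is repeated once per row.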
@ermolenkodev I see your point. I had not considered that each row of a FrameColumn could hold a nested data frame with a different schema. So you are right, it is probably better to have the schema as part of the metadata inside the data frame content.
I'll refactor it.
After some discussion with @ermolenkodev, we decided to rework the metadata a little. I have updated the PR and its description, so it should be ready for a second round of review.
This part adds the infrastructure needed for https://youtrack.jetbrains.com/issue/KTNB-693/Enable-AI-Actions-for-DataFrames-in-Kotlin-Notebooks. We currently cannot reliably detect column types, and we need them when creating prompts for the AI Assistant.
It adds a new "types" property to the top-level "metadata" as well as recursively on each row so it is possible to easily identify column types.
A "columns" property has also been added to the ColumnGroup and FrameColumn metadata. It contains nested column names, similar to the top-level "columns" property.
Example: