Open phodal opened 1 year ago
No, you're right, nested types are not yet supported. :( Interesting data you've got there.
Thanks for sharing. Is there any plan for it? Or should I just try to modify AnyCol.toArrowField to implement it?
Honestly, I overlooked that our Arrow support is missing nested types, so this improvement isn't planned. Right now the team is occupied with improvements to the documentation and the notebooks experience. I don't think anybody is going to work on Arrow in the coming weeks.
You can submit a PR if you want, but apart from toArrowField, the actual writing will also need to be modified, in infillVector here:
https://github.com/Kotlin/dataframe/blob/master/dataframe-arrow/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/ArrowWriterImpl.kt
Thank you, I will try to find a solution.
IndexOutOfBoundsException: index: 31393, length: 2320 (expected: range(0, 32768))
is an unexpected error, and I am working on it (I just got the same one in my project). It happens because the VariableWidthVector (where a String column is saved) does not know its actual size.
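For illustration only, here is a minimal sketch of how a variable-width vector can be filled without overrunning its data buffer in Arrow's Java API. It assumes arrow-vector and a memory implementation (for example arrow-memory-netty) are on the classpath; the function name roundTripStrings and the column name "text" are hypothetical, and this is not necessarily how the library itself addresses the exception.

```kotlin
import org.apache.arrow.memory.RootAllocator
import org.apache.arrow.vector.VarCharVector

// Hypothetical helper: writes strings into a VarCharVector and reads them back.
fun roundTripStrings(values: List<String>): List<String> =
    RootAllocator().use { allocator ->
        VarCharVector("text", allocator).use { vector ->
            // Default allocation: the data buffer may be smaller than the total payload.
            vector.allocateNew()
            values.forEachIndexed { i, s ->
                // setSafe grows the buffers when needed; plain set() assumes
                // enough capacity was reserved and can fail with an
                // IndexOutOfBoundsException otherwise.
                vector.setSafe(i, s.toByteArray(Charsets.UTF_8))
            }
            vector.setValueCount(values.size)
            List(vector.valueCount) { i -> String(vector.get(i), Charsets.UTF_8) }
        }
    }

fun main() {
    // 1000 rows of roughly 100 bytes each, far more than a small default buffer.
    val rows = List(1000) { "row-$it-" + "x".repeat(100) }
    println(roundTripStrings(rows) == rows)
}
```

If the total byte size is known up front, allocateNew(totalBytes, valueCount) can reserve the space once instead of relying on reallocation.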
About nested types, @phodal, do you have any examples from other Java-based projects with Arrow support? And what is your target Arrow schema (does it contain StructVector, ListVector, or any other)?
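To make the question concrete, a nested target schema could be declared with Arrow's Java API roughly as follows. This is only a sketch: the field names (name, position, line, tags, item) and the function nestedSchema are hypothetical and are not taken from the data in this issue.

```kotlin
import org.apache.arrow.vector.types.pojo.ArrowType
import org.apache.arrow.vector.types.pojo.Field
import org.apache.arrow.vector.types.pojo.FieldType
import org.apache.arrow.vector.types.pojo.Schema

// Hypothetical schema with one flat, one struct and one list column.
fun nestedSchema(): Schema {
    val utf8 = FieldType.nullable(ArrowType.Utf8.INSTANCE)
    val int32 = FieldType.nullable(ArrowType.Int(32, true))

    // Flat string column (backed by a VarCharVector).
    val name = Field("name", utf8, emptyList())

    // Struct column: the children are the struct's members (backed by a StructVector).
    val position = Field(
        "position",
        FieldType.nullable(ArrowType.Struct.INSTANCE),
        listOf(Field("line", int32, emptyList())),
    )

    // List column: a single child describes the element type (backed by a ListVector).
    val tags = Field(
        "tags",
        FieldType.nullable(ArrowType.List.INSTANCE),
        listOf(Field("item", utf8, emptyList())),
    )

    return Schema(listOf(name, position, tags))
}

fun main() {
    println(nestedSchema())
}
```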
@Kopilov Sorry, I tried to do it, but it needs a lot of code. So I won't use dataframe with Arrow and will just keep using JSON.
The exception is fixed in #350. Nested types are still not supported natively, but they should be saved correctly as strings.
Hi, in my case, I want to create an Arrow file on the client side, then pass it to the server side. But when I just try to run writeArrowFeather, it throws an IndexOutOfBoundsException. Here is my demo code with the writer and some debug information:
When I debug, dataFrame.schema().print() returns the correct schema:
But in dataFrame.columns().toArrowSchema() the type is wrong:
Am I missing something?