Open gszadovszky opened 10 months ago
@gszadovszky, let me review this issue and come back with a few suggestions.
Thanks @davisusanibar, in advance.
@gszadovszky Would it be possible to use MapVector instead?
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.complex.MapVector;
import org.apache.arrow.vector.complex.impl.UnionMapWriter;

public class TestMeMapVector {
  public static void main(String[] args) {
    try (MapVector mapVector = MapVector.empty("map", new RootAllocator(), false)) {
      mapVector.allocateNew();
      UnionMapWriter mapWriter = mapVector.getWriter();
      mapWriter.allocate();

      mapWriter.startMap();

      mapWriter.startEntry();
      mapWriter.key().varChar().writeVarChar("one");
      mapWriter.value().integer().writeInt(1);
      mapWriter.endEntry();

      mapWriter.startEntry();
      mapWriter.key().varChar().writeVarChar("two");
      mapWriter.value().integer().writeInt(2);
      mapWriter.endEntry();

      mapWriter.startEntry();
      mapWriter.key().varChar().writeVarChar("three");
      mapWriter.value().integer().writeInt(3);
      mapWriter.endEntry();

      mapWriter.writeNull();

      mapWriter.startEntry();
      mapWriter.key().varChar().writeVarChar("four");
      mapWriter.value().integer().writeNull();
      mapWriter.endEntry();

      mapWriter.startEntry();
      mapWriter.key().varChar().writeVarChar("five");
      mapWriter.value().integer().writeInt(5);
      mapWriter.endEntry();

      mapWriter.endMap();
      mapWriter.setValueCount(1);

      System.out.println(mapVector);
      // [[{"key":"one","value":1},{"key":"two","value":2},{"key":"three","value":3},null,{"key":"four"},{"key":"five","value":5}]]
    }
  }
}
In addition, I would like to keep reviewing whether any changes are needed to allow a simple data type to be defined for a Map, as an abstraction over the writer's current state.
Thanks a lot, @davisusanibar, for the example and the further investigation!
It seems I oversimplified my example. What I wanted to write is a vector containing lists of maps. For example:
[{"one" -> 1, "two" -> 2}, {"three" -> 3}, null, {"four" -> null, "five" -> 5}],
[{"six" -> 6}],
[null],
null,
[{"seven" -> 7}, {}, {"eight" -> null, "nine" -> 9}],
[]
Any updates on this, @davisusanibar? Do you think this is something that should work, or am I trying to do something unsupported?
Hi @lidavidm, how do you feel about this kind of vector? Do you think it's unsupported?
I'm not familiar enough with the writer API to say off the top of my head whether you can do this. You can always build the vector by hand (say, build the map vector via the writer, then manually wrap it in a list vector by constructing the offsets yourself).
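To make "constructing the offsets yourself" a bit more concrete, here is a minimal plain-Java sketch that uses no Arrow classes at all. It flattens the list-of-maps example from this thread into a single run of child maps and computes the list-level offsets and validity flags that a list layout needs (one offset per row plus a trailing one; a null row and an empty row both contribute zero children). The names ListOffsetsSketch, Flattened, flatten and exampleRows are made up for illustration. Copying the results into an actual ListVector's buffers is not shown and would need to be checked against the Arrow Java API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: illustrates the offsets/validity idea only, not the Arrow API.
final class ListOffsetsSketch {

    /** Flattened child maps plus the list-level offsets and validity flags. */
    static final class Flattened {
        final List<Map<String, Integer>> children = new ArrayList<>();
        int[] offsets;   // length = rows + 1; row i spans [offsets[i], offsets[i + 1])
        boolean[] valid; // false where the whole list is null
    }

    static Flattened flatten(List<List<Map<String, Integer>>> rows) {
        Flattened out = new Flattened();
        out.offsets = new int[rows.size() + 1];
        out.valid = new boolean[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            List<Map<String, Integer>> row = rows.get(i);
            out.valid[i] = row != null;
            if (row != null) {
                out.children.addAll(row); // a null map still occupies one child slot
            }
            out.offsets[i + 1] = out.children.size();
        }
        return out;
    }

    /** Builds a map from alternating key/value arguments; values may be null. */
    static Map<String, Integer> map(Object... kv) {
        Map<String, Integer> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], (Integer) kv[i + 1]);
        }
        return m;
    }

    /** The six rows from the example in this thread. */
    static List<List<Map<String, Integer>>> exampleRows() {
        List<Map<String, Integer>> row0 = Arrays.asList(
            map("one", 1, "two", 2), map("three", 3), null, map("four", null, "five", 5));
        List<Map<String, Integer>> row1 = Arrays.asList(map("six", 6));
        List<Map<String, Integer>> row2 = Collections.singletonList(null);
        List<Map<String, Integer>> row3 = null;
        List<Map<String, Integer>> row4 = Arrays.asList(
            map("seven", 7), map(), map("eight", null, "nine", 9));
        List<Map<String, Integer>> row5 = new ArrayList<>();
        return Arrays.asList(row0, row1, row2, row3, row4, row5);
    }

    public static void main(String[] args) {
        Flattened f = flatten(exampleRows());
        System.out.println(Arrays.toString(f.offsets)); // [0, 4, 5, 6, 6, 9, 9]
        System.out.println(Arrays.toString(f.valid));   // [true, true, true, false, true, true]
        System.out.println(f.children.size());          // 9
    }
}
```

With this split, the nine child maps (including the two null slots) are what would be written through the map writer, while the offsets and validity describe the outer list level.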
@lidavidm, thanks for your reply.
I am building the vectors using the writers based on a Parquet schema. Everything works fine except for this type of nested data. Since there is a UnionListWriter.map(), it seems it should be supported. There are even different implementations for the different overloaded versions of map.
Describe the usage question you have. Please include as many useful details as possible.
I'd like to store a list of maps. I'm not sure whether I am doing it wrong or there is a bug in the implementation. Here is my example:
When I execute this I'm getting the following exception:
Component(s)
Java