apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

Unable to create a list of Maps with Decimal key or value #43039

Open mdesmet opened 1 week ago

mdesmet commented 1 week ago

Describe the usage question you have. Please include as many useful details as possible.

Tested on Arrow 15.0.2

I would like to create a list vector of maps. This works fine for most data types.

However, creating a list of Map<Decimal, Decimal> fails with the following error:

java.lang.UnsupportedOperationException: Cannot get simple type for type MAP

    at org.apache.arrow.vector.types.Types$MinorType.getType(Types.java:807)
    at org.apache.arrow.vector.complex.impl.PromotableWriter.getWriter(PromotableWriter.java:274)
    at org.apache.arrow.vector.complex.impl.AbstractPromotableFieldWriter.getWriter(AbstractPromotableFieldWriter.java:83)
    at org.apache.arrow.vector.complex.impl.AbstractPromotableFieldWriter.startMap(AbstractPromotableFieldWriter.java:117)
    at org.apache.arrow.vector.complex.impl.PromotableWriter.startMap(PromotableWriter.java:52)
    at io.trino.plugin.hive.functions.TestUnloadArrow.testListOfMap(TestUnloadArrow.java:966)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

Test that reproduces the error:

    @Test
    public void testListOfMap() {
        try (ListVector from = ListVector.empty("v", new RootAllocator())) {
            UnionListWriter listWriter = from.getWriter();
            listWriter.allocate();

            // Intended contents: a null list at even positions; at odd positions
            // a list holding one map, written as a null followed by the entry 2.0 -> 3.0.
            for (int i = 0; i < 10; i++) {
                listWriter.setPosition(i);
                if (i % 2 == 0) {
                    listWriter.writeNull();
                    continue;
                }
                listWriter.startList();
                listWriter.map().startMap();
                listWriter.map().writeNull();
                listWriter.map().startEntry();
                listWriter.map().key();
                listWriter.map().decimal().writeDecimal(BigDecimal.valueOf(2.0));
                listWriter.map().value();
                listWriter.map().decimal().writeDecimal(BigDecimal.valueOf(3.0));
                listWriter.endEntry();
                listWriter.map().endMap();
                listWriter.endList();
            }
            from.setValueCount(10);

            System.out.println(from);
        }
    }

Struct writing works fine; however, the Decimal instantiation requires scale and precision to be passed in.

The following test works fine:

    @Test
    public void testListOfStruct() {
        try (ListVector from = ListVector.empty("v", new RootAllocator())) {

            UnionListWriter listWriter = from.getWriter();
            listWriter.allocate();

            // write null and [null, {"f1": 2.0}, null, {"f1": 3.0}, null]
            // alternately
            for (int i = 0; i < 10; i++) {
                listWriter.setPosition(i);
                if (i % 2 == 0) {
                    listWriter.writeNull();
                    continue;
                }
                listWriter.startList();
                listWriter.struct().writeNull();
                listWriter.struct().start();
                listWriter.struct().decimal("f1", 1, 2).writeDecimal(BigDecimal.valueOf(2.0));
                listWriter.struct().end();
                listWriter.struct().writeNull();
                listWriter.struct().start();
                listWriter.struct().decimal("f1", 1, 2).writeDecimal(BigDecimal.valueOf(3.0));
                listWriter.struct().end();
                listWriter.struct().writeNull();
                listWriter.endList();
            }
            from.setValueCount(10);

            System.out.println(from);
        }
    }
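
For reference, the scale (1) and precision (2) passed to `decimal("f1", 1, 2)` above match what `BigDecimal` itself reports for the values being written. A minimal sketch using only `java.math`, with no Arrow dependency; the class and method names (`DecimalShape`, `precisionOf`, `scaleOf`) are illustrative and not part of any library:

```java
import java.math.BigDecimal;

// Hypothetical helper: reads the precision and scale that a decimal
// writer would need in order to hold a given BigDecimal.
final class DecimalShape {

    // Number of digits in the unscaled value.
    static int precisionOf(BigDecimal value) {
        return value.precision();
    }

    // Number of digits to the right of the decimal point.
    static int scaleOf(BigDecimal value) {
        return value.scale();
    }

    public static void main(String[] args) {
        // BigDecimal.valueOf(double) goes through Double.toString, so 2.0 becomes "2.0".
        BigDecimal two = BigDecimal.valueOf(2.0);
        System.out.println(two + " -> precision=" + precisionOf(two) + ", scale=" + scaleOf(two));
        // prints "2.0 -> precision=2, scale=1"
    }
}
```

The map writer path in the failing test calls `decimal()` without these two arguments, so the sketch above only covers the struct case.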

Is there any way to work around this? I have tried specifying the FieldType when creating the ListVector, but that FieldType describes only the parent type (LIST) and carries no child types.

Component(s)

Java

vibhatha commented 1 week ago

@mdesmet I am looking into this issue.