apache / arrow-java

Official Java implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

[Java] Implement converter between Arrow record batches and Avro records #404

Open asfimport opened 5 years ago

asfimport commented 5 years ago

This would be useful for applications that need to convert Avro data to Arrow data.

This is an adapter that converts data through an existing API (like the JDBC adapter) rather than a native reader (like ORC).

We implement this on top of the Avro Java library: the adapter takes Avro objects such as a Decoder, Schema, or DatumReader and returns a VectorSchemaRoot. Each data type has a consumer class, like the one below, that reads Avro data and writes it directly into a vector, avoiding boxing/unboxing (e.g. GenericRecord#get returns Object).


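import java.io.IOException;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.complex.impl.IntWriterImpl;
import org.apache.arrow.vector.complex.writer.IntWriter;
import org.apache.avro.io.Decoder;

// Consumer is the adapter's own per-type interface proposed in this issue.
public class AvroIntConsumer implements Consumer {
  private final IntWriter writer;

  public AvroIntConsumer(IntVector vector) {
    this.writer = new IntWriterImpl(vector);
  }

  @Override
  public void consume(Decoder decoder) throws IOException {
    // Read the primitive int straight from the decoder, so nothing is boxed.
    writer.writeInt(decoder.readInt());
    // Advance the writer to the next row.
    writer.setPosition(writer.getPosition() + 1);
  }
}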

We intend to support primitive and complex types, with null values represented via a union with the null type. A size limit and field selection could be optional for users.
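To make the nullable-union, field-selection, and size-limit ideas concrete, here is a minimal sketch that deliberately avoids the Avro and Arrow libraries so it can run on its own. Everything in it is an assumption for illustration: `FieldConsumer`, `NullableIntColumn`, `SkipLong`, and `decode` are hypothetical names, plain arrays stand in for an Arrow vector and its validity buffer, and the input is assumed to be raw Avro binary for a record with a `long` field followed by a `["null", "int"]` field. It is not the adapter's actual implementation.

```java
import java.nio.ByteBuffer;
import java.util.List;

final class AvroToColumnsSketch {

  /** Hypothetical stand-in for the per-type Consumer interface described above. */
  interface FieldConsumer {
    void consume(ByteBuffer in, int row);
  }

  /** Stand-in for a nullable int vector: a values array plus a validity array. */
  static final class NullableIntColumn implements FieldConsumer {
    final int[] values;
    final boolean[] valid;

    NullableIntColumn(int capacity) {
      values = new int[capacity];
      valid = new boolean[capacity];
    }

    @Override
    public void consume(ByteBuffer in, int row) {
      // Avro writes a union as the zero-based branch index, then the branch value.
      // For the schema ["null", "int"], branch 0 is null and carries no bytes.
      long branch = readVarLong(in);
      if (branch == 1) {
        values[row] = (int) readVarLong(in);
        valid[row] = true;
      }
    }
  }

  /** Field selection: an unselected long field is still decoded, then discarded. */
  static final class SkipLong implements FieldConsumer {
    @Override
    public void consume(ByteBuffer in, int row) {
      readVarLong(in);
    }
  }

  /** Reads one Avro int/long: a base-128 varint, least significant group first, zigzag-decoded. */
  static long readVarLong(ByteBuffer in) {
    long raw = 0;
    int shift = 0;
    byte b;
    do {
      b = in.get();
      raw |= (long) (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return (raw >>> 1) ^ -(raw & 1);
  }

  /** Decodes records until the input is exhausted or maxRows is reached; returns rows read. */
  static int decode(ByteBuffer in, List<FieldConsumer> fields, int maxRows) {
    int row = 0;
    while (in.hasRemaining() && row < maxRows) {
      for (FieldConsumer field : fields) {
        field.consume(in, row);
      }
      row++;
    }
    return row;
  }
}
```

In this sketch, passing `List.of(new SkipLong(), column)` to `decode` selects only the nullable int field, and `maxRows` plays the role of the optional size limit. Each column must be allocated with capacity for at least as many rows as will actually be decoded.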

Reporter: Ji Liu / @tianchen92

Subtasks:

Note: This issue was originally created as ARROW-5845. Please see the migration documentation for further details.

asfimport commented 5 years ago

Ji Liu / @tianchen92: Thanks @emkornfield, I closed this umbrella issue.

asfimport commented 5 years ago

Micah Kornfield / @emkornfield: Thanks @tianchen92. I think there is still probably room to improve functionality and performance. If you are still interested in working in this area, I can create a new set of JIRAs.

asfimport commented 5 years ago

Ji Liu / @tianchen92: @emkornfield Sure, I think you could just create JIRAs under this one and I will take them as they become available, thanks.

asfimport commented 2 years ago

Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.