jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0

deserialize_schema looks not working #1595

Open andreclaudino opened 7 months ago

andreclaudino commented 7 months ago

I need to translate working Python code into Rust. The code receives a JSON payload (the event object in the following snippet) with the entries input_schema, output_schema, and input_records, each a base64-encoded string.

The working Python snippet is the following:

import base64
import pyarrow as pa

def extract_record_batches(cls, event):
    input_schema = pa.ipc.read_schema(
        pa.BufferReader(base64.b64decode(event["input_schema"]))
    )
    output_schema = pa.ipc.read_schema(
        pa.BufferReader(base64.b64decode(event["output_schema"]))
    )
    record_batch = pa.ipc.read_record_batch(
        pa.BufferReader(
            base64.b64decode(event["input_records"])
        ),
        input_schema,
    )
    record_batch_list = record_batch.to_pylist()
    return record_batch_list

How could I convert this snippet to Rust? I am already stuck at the beginning of the process, on how to extract the schema:

pub async fn extract_payload(&self) -> anyhow::Result<CustomerPayload> {
    let input_schema_bytes = base64::engine::general_purpose::STANDARD.decode(&self.input_schema)?;
    println!("input schema bytes: {:?}", input_schema_bytes);

    let (input_schema, input_ipc_schema) = arrow2::io::ipc::read::deserialize_schema(&input_schema_bytes[..])?;
    println!("input schema: {:?}", input_schema);
    println!("input ipc schema: {:?}", input_ipc_schema);

    ...

Calling deserialize_schema results in the following error, which suggests that either deserialize_schema is not working properly or I am doing something wrong. If the mistake is mine, can you help me with the correct translation into Rust using the arrow2 crate?

In <Message@84>::header(): Invalid vtable length (length = 8)
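
One possible cause, not confirmed here: the bytes that pyarrow serializes for a schema are an encapsulated IPC message, which in the Arrow format starts with a 4-byte continuation marker (0xFFFFFFFF) and a 4-byte little-endian metadata length before the flatbuffer itself, while deserialize_schema may expect only the bare flatbuffer. The sketch below uses only the standard library to strip that prefix. strip_ipc_prefix and CONTINUATION_MARKER are hypothetical names for illustration and are not part of arrow2.

```rust
/// Marker that prefixes encapsulated Arrow IPC messages (format 0.15 and later).
/// Hypothetical constant name, not taken from arrow2.
const CONTINUATION_MARKER: [u8; 4] = [0xFF; 4];

/// Returns the metadata (flatbuffer) slice of an encapsulated IPC message.
/// Skips the continuation marker if present, then reads the 4-byte
/// little-endian length and returns exactly that many following bytes.
/// Hypothetical helper, not part of arrow2.
fn strip_ipc_prefix(bytes: &[u8]) -> Result<&[u8], String> {
    let mut rest = bytes;

    // Older writers omit the marker, so it is treated as optional.
    if rest.starts_with(&CONTINUATION_MARKER) {
        rest = &rest[4..];
    }

    if rest.len() < 4 {
        return Err("buffer too short for a length prefix".to_string());
    }

    // The metadata length is a signed 32-bit little-endian integer.
    let len = i32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]);
    if len < 0 {
        return Err("negative metadata length".to_string());
    }
    let len = len as usize;

    let payload = &rest[4..];
    if payload.len() < len {
        return Err(format!(
            "metadata length {} exceeds remaining {} bytes",
            len,
            payload.len()
        ));
    }
    Ok(&payload[..len])
}
```

If this assumption holds, the returned slice could be passed to deserialize_schema in place of input_schema_bytes. arrow2 also has read_stream_metadata in the same module, which reads from a std::io::Read and may handle this prefix itself; I have not checked it against this payload.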