dflemstr / rq

Record Query - A tool for doing record analysis and transformation
Apache License 2.0

Ordered Map #209

Open tomberek opened 4 years ago

tomberek commented 4 years ago

When using rq with Avro, keys are re-ordered, which causes a validation error. An Avro file and schema that work perfectly with avro-tools fail to write with rq. Reading works fine, and outputting to JSON or another format is okay, but on write the schema validation fails: the way Maps are stored internally in rq does not preserve the original key order, which is significant for some formats like Avro.

Something like:

cat data.avro | rq -aA avro.avsc
[ERROR] [rq] Encountered: Avro error
[ERROR] [rq] Caused by: validation error
[ERROR] [rq] Caused by: Decoding error: value does not match schema
[ERROR] [rq] (Re-run with --trace or RUST_BACKTRACE=1 for a backtrace)

Manually adjusting the schema to match the resulting order works. There is a TODO here that seems relevant. I attempted to use https://github.com/bluss/indexmap and to make the Map a tuple of a Vec and a BTree, hoping to store the order in the Vec. I'd like to help (and learn more Rust on the way), but my Rust-fu was not good enough to create a working PR.
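
For illustration, here is a minimal sketch of the Vec-plus-BTree idea described above, using only the standard library. It is not rq's code: `OrderedMap`, its methods, and `key_orders` are hypothetical names, and it assumes the internal map is a sorted one such as `BTreeMap`, which would explain the re-ordering.

```rust
use std::collections::BTreeMap;

/// Hypothetical insertion-ordered map (an assumption, not rq's actual Map type).
#[derive(Debug)]
pub struct OrderedMap<V> {
    /// Entries in the order their keys were first inserted.
    entries: Vec<(String, V)>,
    /// Key -> position in `entries`, used for lookups.
    index: BTreeMap<String, usize>,
}

impl<V> OrderedMap<V> {
    pub fn new() -> Self {
        OrderedMap { entries: Vec::new(), index: BTreeMap::new() }
    }

    /// Inserts a value. An existing key keeps its original position and
    /// the previous value is returned.
    pub fn insert(&mut self, key: &str, value: V) -> Option<V> {
        if let Some(&pos) = self.index.get(key) {
            Some(std::mem::replace(&mut self.entries[pos].1, value))
        } else {
            self.index.insert(key.to_string(), self.entries.len());
            self.entries.push((key.to_string(), value));
            None
        }
    }

    /// Looks a value up by key through the index.
    pub fn get(&self, key: &str) -> Option<&V> {
        self.index.get(key).map(|&pos| &self.entries[pos].1)
    }

    /// Iterates over keys in insertion order.
    pub fn keys(&self) -> impl Iterator<Item = &str> + '_ {
        self.entries.iter().map(|(k, _)| k.as_str())
    }
}

/// Inserts the same field names into a plain BTreeMap and into an
/// OrderedMap, and returns the key order each one yields.
pub fn key_orders(fields: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut sorted: BTreeMap<String, i64> = BTreeMap::new();
    let mut ordered: OrderedMap<i64> = OrderedMap::new();
    for (i, &name) in fields.iter().enumerate() {
        sorted.insert(name.to_string(), i as i64);
        ordered.insert(name, i as i64);
    }
    (
        sorted.keys().cloned().collect(),
        ordered.keys().map(String::from).collect(),
    )
}
```

With made-up field names `zip`, `name`, `age` inserted in that order, the `BTreeMap` iterates them alphabetically while `OrderedMap` returns them as inserted. The sketch leaves out removal, which would require re-indexing the Vec; the indexmap crate linked above handles that case.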

jcaesar commented 1 year ago

Does this persist on current master? If so, do you have an example Avro file I could use for testing?