flavray / avro-rs

Avro client library implementation in Rust
MIT License

how to support convert json to avro value with specified schema? #154

Open clojurians-org opened 4 years ago

clojurians-org commented 4 years ago

I want to implement the equivalent of this avro-tools command:

avro-tools fromjson --schema-file ../../data-model/schema/dc.avsc ../../data-model/sample/dc.json > dc_schema.avro

I can use [to_avro_datum] to write an [Avro Value] to a binary file, but I can't find a way to convert a [Json Value] to an [Avro Value] with a specified schema.

I can simply convert the [Json Value] to an [Avro Value], but its schema is not right:

let avro_val = avro_rs::to_value(json_val).unwrap();

Maybe [from_avro_datum] with a [json reader] could do it, but it is bound to a bytes reader.

poros commented 4 years ago

I would recommend you use the Writer and Reader if you can. On their own, to_avro_datum and from_avro_datum are not fully compliant with the Avro spec if used in the wrong way. Nevertheless, they might be OK for your use case; I have never used that command line tool, so I can't say for sure.

The JSON you have might be complex and perhaps might not be straightforward to translate to avro. Or there could be an incompatibility with the schema. Or some feature missing in the library (unfortunately there are quite a few). Without a bit more info, it's a bit hard for me to help out more, I am afraid...

clojurians-org commented 4 years ago

I already use a C implementation at work, and I added the enum type support that it was missing. It can be ported to a Rust implementation easily: https://github.com/grisha/json2avro/blob/master/json2avro.c

clojurians-org commented 4 years ago

This is avro-tools code: https://github.com/apache/avro/blob/master/lang/java/tools/src/main/java/org/apache/avro/tool/Main.java

It basically contains two levels of format: raw Avro fragments (low level), and the Avro container with schema, blocks and codec. For message protocols, we often use fragments without the schema, as that reduces the size. The JavaScript Avro library belongs to this category.
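To make the low-level format concrete: in a raw fragment, a record is just its field values encoded one after another, with no schema, header or field names. Below is a minimal sketch using only the standard library. The helper names `encode_long` and `encode_string` are hypothetical (they are not avro-rs functions), and the record with a string field followed by an int field is an assumed example.

```rust
/// Encode an Avro `int`/`long`: zigzag-map the signed value to unsigned,
/// then write it as a base-128 varint, least significant group first.
fn encode_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        // Low 7 bits plus a continuation bit.
        out.push((z as u8 & 0x7f) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Encode an Avro `string`: byte length as a long, then the UTF-8 bytes.
fn encode_string(s: &str, out: &mut Vec<u8>) {
    encode_long(s.len() as i64, out);
    out.extend_from_slice(s.as_bytes());
}

fn main() {
    // Hypothetical record with fields (name: string, age: int),
    // holding the values ("a", 1). Only the values are written.
    let mut datum = Vec::new();
    encode_string("a", &mut datum);
    encode_long(1, &mut datum);
    // Three bytes in total: 0x02 (length 1), 0x61 ('a'), 0x02 (int 1).
    println!("{:02x?}", datum);
}
```

A reader can only decode these bytes if it already knows the schema, which is why this form suits message protocols where both sides agree on the schema in advance. The container file format instead stores the schema in the file header.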

poros commented 4 years ago

I see, thanks for sharing more details. It does seem like a good use case for to_avro_datum. Without the error and the input data it is a bit difficult to understand what is going on, but let me give you some pointers.

This is where the function you are calling is defined: https://github.com/flavray/avro-rs/blob/d841c04aa7519b3f6f6b3632c0bd583d4ceee5b9/src/ser.rs#L470 As you can see, this is pretty low level and it can fail if the avro schema is not valid as per the Avro spec.

Another option you have is using https://github.com/flavray/avro-rs/blob/master/src/types.rs#L231 , which implements a very naive conversion from JSON value to Avro value. This also could fail if you need a more clever conversion because of your avro schema.

One more option is https://serde.rs/transcode.html which could help you out with the transcoding.