compactr / compactr.js

Schema based serialization made easy

What's the difference between compactr and Apache Avro? #5

Closed: StarpTech closed this issue 7 years ago

StarpTech commented 7 years ago

Hi, I like the idea behind compactr, but your goals look very similar to Avro's. Can you explain the differences further?

Avro: http://avro.apache.org/docs/1.8.1/
Node.js: https://github.com/mtth/avsc

Differences I see:

Thank you.

fed135 commented 7 years ago

I haven't personally used Avro, but from what I can gather, these are the main differences.

Audience

Compactr is more oriented towards real-time serialization of very small payloads, which makes it a good fit for gossiping or low-bandwidth transfer of input data (such as in online games). It was designed to work well with Kalm.

Avro as a whole is meant to be used as an RPC framework where schemas and their lifetimes are automatically managed. When Avro is used for RPC, the client and server exchange schemas in the connection handshake. This works great for Hadoop clusters, or when working with big data.

Features

Compactr Plus: Because schemas don't have to be files, you can generate or alter them dynamically on the fly. As a standalone serialization strategy, it lets you manage your schemas the way you want. The output buffer is smaller than Avro's or Protobuf's in most scenarios and can get even smaller with the streaming methods. Compactr also offers data type coercion for dynamically typed languages.
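To make the "schemas as in-memory objects" and "type coercion" points concrete, here is a minimal sketch. It does not use compactr's API: `buildSchema`, `coerce`, and `applySchema` are hypothetical helpers written for this illustration, and the `{ type: '...' }` field shape is an assumption.

```javascript
// Hypothetical helpers for illustration only; this is not compactr's API.

// A schema is a plain object, so it can be built or changed at runtime.
function buildSchema(fieldNames, type) {
  const schema = {};
  for (const name of fieldNames) schema[name] = { type };
  return schema;
}

// Coerce a value to the type the schema declares for its key.
function coerce(value, type) {
  switch (type) {
    case 'number': return Number(value);
    case 'string': return String(value);
    case 'boolean': return Boolean(value);
    default: throw new TypeError(`Unsupported type: ${type}`);
  }
}

// Apply the schema: drop unknown keys, coerce the known ones.
function applySchema(schema, data) {
  const out = {};
  for (const key of Object.keys(schema)) {
    if (key in data) out[key] = coerce(data[key], schema[key].type);
  }
  return out;
}

const schema = buildSchema(['x', 'y'], 'number');
schema.name = { type: 'string' }; // altered on the fly, no schema file involved

const result = applySchema(schema, { x: '12', y: 7, name: 42, extra: true });
console.log(result); // { x: 12, y: 7, name: '42' }
```

The string `'12'` becomes a number, the number `42` becomes a string, and the undeclared `extra` key is dropped before anything would be written to a buffer.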

Compactr Minus: Compactr has a few limitations right now. It does not support many data types beyond what you find in JS (no enums, etc.). Objects are limited to 255 keys. You have to write a lot of boilerplate code to make streaming work, and finally, there is the sometimes painful manual management of schemas.
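A cap on the number of keys is what you would expect if each key is addressed by a single byte. The sketch below assumes such a layout (one byte for the key index, one byte for the value length, then the value); both the layout and the `encodeField` helper are hypothetical and are not taken from compactr's protocol document.

```javascript
// Hypothetical layout for illustration:
// [key index (1 byte)][value length (1 byte)][UTF-8 value]
function encodeField(keys, key, value) {
  const index = keys.indexOf(key);
  if (index < 0 || index > 0xff) {
    throw new RangeError(`Key "${key}" cannot be addressed with one byte`);
  }
  const body = Buffer.from(String(value), 'utf8');
  if (body.length > 0xff) {
    throw new RangeError('Value too long for a one-byte length');
  }
  return Buffer.concat([Buffer.from([index, body.length]), body]);
}

const keys = ['id', 'name'];
const encoded = encodeField(keys, 'name', 'John');

console.log(encoded.length); // 6 bytes: index, length, then "John"
console.log(JSON.stringify({ name: 'John' }).length); // 15 characters as JSON
```

Replacing the key name with a one-byte index is where the size saving comes from, and it is also why a key list cannot grow past what one byte can address.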

Avro Plus: It's well backed and thoroughly battle-tested. It's already implemented in a couple of languages. It offers a few more recovery options in case of data corruption (data sections have a length byte and a termination byte). The RPC framework around it makes schema updates a breeze. It supports a wide array of complex data types and is designed to play nicely with Hadoop.
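To show why a length byte plus a termination byte helps with recovery, here is a generic sketch of that framing idea as described above. It is not Avro's actual wire format; `frame`, `readFrame`, and the `TERMINATOR` value are assumptions made for this illustration.

```javascript
const TERMINATOR = 0x00; // hypothetical marker value

// Wrap a payload as [length (1 byte)][payload][terminator (1 byte)].
function frame(payload) {
  if (payload.length > 0xff) throw new RangeError('Payload too long');
  return Buffer.concat([
    Buffer.from([payload.length]),
    payload,
    Buffer.from([TERMINATOR]),
  ]);
}

// Read one frame; return null when the length and terminator disagree,
// which signals a truncated or corrupted section.
function readFrame(buffer, offset = 0) {
  const length = buffer[offset];
  const end = offset + 1 + length;
  if (length === undefined || end >= buffer.length || buffer[end] !== TERMINATOR) {
    return null;
  }
  return { payload: buffer.subarray(offset + 1, end), next: end + 1 };
}

const framed = frame(Buffer.from('abc', 'utf8'));
console.log(readFrame(framed).payload.toString('utf8')); // abc
console.log(readFrame(framed.subarray(0, 4))); // null (truncated frame)
```

Because the reader knows where the terminator should be, a mismatch is detected immediately instead of silently decoding garbage, at the cost of two extra bytes per section.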

Avro Minus: When used strictly for serialization, it is essentially another, less performant Protobuf-like library.

I'll dig deeper when I have time, but this is what I've uncovered so far. Correct me if I've assumed wrong with regard to Avro. I'll also try benchmarking against avsc.

StarpTech commented 7 years ago

Hi @fed135, thank you for the precise answer. Your arguments are valid. What's the roadmap for compactr? Do you plan to support some kind of schema validation?

fed135 commented 7 years ago

My pleasure, @StarpTech! In terms of my own roadmap, I'm now focusing on slides and demos to present at local meetups, which I will be sharing for public use. I don't plan on adding features until I've cleaned up the Protocol Document. This is to unblock the people who've expressed interest in writing an implementation in their own language. (A Golang and a C# version might see the light of day at some point.)

Feature-wise, I think I only had streaming-related utils and nitpicks in mind. I'm super open to feature requests via tickets or pull requests, so if schema validation is something you want to see in compactr, we could open a ticket and take the conversation there. :)