
Inter-service communication: a comparative analysis of remote communication protocols #33

Open · wanghaisheng opened this issue 9 years ago

wanghaisheng commented 9 years ago

References: 1. Remote communication protocols: comparing the strengths and weaknesses of data-interchange protocols. 2. Comparing Apache Avro and Thrift. 3. Approaches to cross-language invocation from Java. 4. Cap'n Proto is an insanely fast data interchange format and capability-based RPC system. Think JSON, except binary. Or think Protocol Buffers, except faster. In fact, in benchmarks, Cap'n Proto is INFINITY TIMES faster than Protocol Buffers.

Three articles on Avro's advantages: http://radar.oreilly.com/2014/11/the-problem-of-managing-schemas.html http://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html http://blog.confluent.io/2015/02/25/stream-data-platform-2/ Converting database table definitions to Avro schemas; converting XML schemas to Avro schemas: 1. Nokia's tool, which generates Avro schemas and Java bindings from XML schemas.

wanghaisheng commented 9 years ago

Remote invocation styles

| Type | Description | Example |
| --- | --- | --- |
| Lightweight | binary serialization + TCP | |
| Lightweight | binary serialization + HTTP | |
| Lightweight | text serialization + HTTP | |
| Heavyweight | text serialization + HTTP | |

| Type | Comparison | Trade-offs | Summary |
| --- | --- | --- | --- |
| Binary | Avro vs. Thrift | Avro's innovation is combining an explicit, declarative schema with an efficient binary encoding, emphasizing self-describing data and overcoming the shortcomings of purely XML-based or purely binary systems. Avro can load schemas dynamically, which Thrift's programming interface cannot: Thrift relies entirely on IDL-to-binding-language code generation, so its schema is "hidden" in the generated code and completely static, and handling a new data source means editing the IDL, regenerating the code, and recompiling. By contrast, although Avro also supports IDL-based schema descriptions, Avro schemas stay explicit as JSON documents, and Avro can convert an IDL schema to JSON. Avro offers two modes: Avro-specific resembles Thrift, generating specific classes with the JSON schema embedded; Avro-generic supports dynamic schema loading, representing data with generic maps, so new data sources can be handled without compiling and loading code. Thrift currently supports more languages (C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, Smalltalk) than Avro (C, C++, C#, Java, Python, Ruby, PHP). | Thrift suits static program-to-program data exchange with a schema that is known in advance and relatively fixed. Avro adds dynamic schema support on top of that without losing performance; its explicit schema design makes it better suited to building general-purpose tools and platforms for data exchange and storage, especially on the back end. Thrift's current edge is broader language support and relative maturity. |
wanghaisheng commented 9 years ago

Comparing Apache Avro and Thrift

Avro and Thrift are both cross-language, binary, high-performance communication middleware. Both provide data serialization and RPC services. Their overall functionality is similar, but their philosophies differ. Thrift came out of Facebook for communication among its back-end services, and its design emphasizes a multi-language communication framework with a unified programming interface. Avro comes from Hadoop's father, Doug Cutting; launched when Thrift was already quite popular, its goal is not merely to provide Thrift-like communication middleware but to establish a new, standard protocol for data exchange and storage in cloud computing. This differs from Thrift's philosophy: Thrift assumes no single solution can fit every problem, so it tries to stay a neutral framework into which different implementations can be plugged and interoperate, whereas Avro leans practical, rejects the potential confusion of multiple options, advocates a single unified standard, and does not mind adopting specific optimizations. Avro's innovation lies in combining an explicit, declarative schema with an efficient binary encoding, emphasizing self-describing data and overcoming the shortcomings of purely XML-based or purely binary systems. Avro's dynamic schema loading, which Thrift's programming interface lacks, fits applications like Hive/Pig and NoSQL stores on Hadoop that are both ad hoc and performance-hungry.

Language bindings

At this stage Thrift supports more languages than Avro.

Thrift: C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, Smalltalk.

Avro: C, C++, Java, Python, Ruby, PHP.

Data types

In terms of common data types, Avro and Thrift are very close; functionally there is little difference.

Primitive types:

| Avro | Thrift | Description |
| --- | --- | --- |
| boolean | bool | true or false |
| N/A | byte | 8-bit signed integer |
| N/A | i16 | 16-bit signed integer |
| int | i32 | 32-bit signed integer |
| long | i64 | 64-bit signed integer |
| float | N/A | 32-bit floating point |
| double | double | 64-bit floating point |
| bytes | binary | byte sequence |
| string | string | character sequence |

Complex types:

| Avro | Thrift | Description |
| --- | --- | --- |
| record | struct | user-defined type |
| enum | enum | |
| array | list | |
| N/A | set | |
| map<string,T> | map<T1,T2> | Avro map keys must be strings |
| union | union | |
| fixed | N/A | fixed-size byte array, e.g. fixed md5(16) |

RPC declarations:

| Avro | Thrift | Description |
| --- | --- | --- |
| protocol | service | RPC service |
| error | exception | RPC exception type |
| namespace | namespace | namespace |

Development workflow

From a developer's point of view, Avro and Thrift are also quite similar.

1) The same service described in both Avro and Thrift

Avro.idl:

```
protocol SimpleService {

  record Message {
    string topic;
    bytes content;
    long createdTime;
    string id;
    string ipAddress;
    map<string, string> props;
  }

  int publish(string context, array<Message> messages);
}
```

Thrift.idl:

```
struct Message {
  1: string topic
  2: binary content
  3: i64 createdTime
  4: string id
  5: string ipAddress
  6: map<string,string> props
}

service SimpleService {
  i32 publish(1:string context, 2:list<Message> messages);
}
```

2) Both Avro and Thrift generate code from the IDL

```
java idl avro.idl idl.avro
java org.apache.avro.specific.SpecificCompiler idl.avro avro-gen
```

This produces Message.java and SimpleService.java in the target directory.

```
thrift -gen java thrift.idl
```

Likewise, this produces Message.java and SimpleService.java in the target directory.

3) Client code

Avro client:

```java
URL url = new URL("http", HOST, PORT, "/");
Transceiver transceiver = new HttpTransceiver(url);
SimpleService proxy =
    (SimpleService) SpecificRequestor.getClient(SimpleService.class, transceiver);
```

Thrift client:

```java
TTransport transport = new TFramedTransport(new TSocket(HOST, PORT));
TProtocol protocol = new TCompactProtocol(transport);
transport.open();
SimpleService.Client client = new SimpleService.Client(protocol);
```

4) Server side: both Avro and Thrift generate an interface that has to be implemented.

Avro server:

```java
public static class ServiceImpl implements SimpleService {
    // ...
}

Responder responder = new SpecificResponder(SimpleService.class, new ServiceImpl());
Server server = new HttpServer(responder, PORT);
```

Thrift server:

```java
public static class ServerImpl implements SimpleService.Iface {
    // ...
}

// The processor wraps the implementation, e.g.:
// SimpleService.Processor processor = new SimpleService.Processor(new ServerImpl());
TServerTransport serverTransport = new TServerSocket(PORT);
TServer server = new TSimpleServer(processor, serverTransport,
    new TFramedTransport.Factory(), new TCompactProtocol.Factory());
server.serve();
```

Schema handling

Avro and Thrift handle schemas in completely different ways.

Thrift is a programming-oriented system that relies entirely on IDL-to-binding-language code generation. The schema is "hidden" inside the generated code and completely static. For the system to recognize and handle a new data source, you must go through the cycle of editing the IDL, regenerating the code, and compiling and loading it.

By contrast, although Avro also supports schemas described in an IDL, Avro schemas remain explicit internally, stored as JSON documents; Avro can convert an IDL schema into the JSON form.

Avro supports two modes. The Avro-specific mode resembles Thrift: it relies on code generation to produce specific classes, with the JSON schema embedded in them. The Avro-generic mode supports dynamic schema loading, representing data with a generic structure (a map), so new data sources can be handled without compiling and loading code.
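For a feel of the Avro-generic mode, here is a minimal sketch; the schema file name and the use of the earlier Message record are assumptions for illustration:

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;

// Load the schema at runtime from its JSON definition: no code generation step.
Schema schema = new Schema.Parser().parse(new File("Message.avsc"));

// A generic reader materializes records as GenericRecord, essentially a typed map.
GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
Decoder decoder = DecoderFactory.get().binaryDecoder(encodedBytes, null);
GenericRecord record = reader.read(null, decoder);
Object topic = record.get("topic");  // fields are accessed by name, not by generated getters
```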

Serialization

For serialization Avro specifies a protocol, whereas Thrift's design goal is a framework, so it does not mandate any particular serialization format.

Avro prescribes a standard serialized form: whether data is stored in a file or sent over the network, the data's schema (in JSON) precedes the data, and the data itself carries no metadata (tags). In file storage the schema appears in the file header; in network transport the schema is exchanged during the initial handshake. The benefits are twofold: the data is self-describing, improving transparency and interoperability, and the data itself carries less information, improving storage efficiency.
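A minimal sketch of the file case, using Avro's data-file API; the schema file, record fields, and file names are illustrative assumptions:

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

Schema schema = new Schema.Parser().parse(new File("Message.avsc"));

// Writing: the schema goes into the file header once, not in front of every record.
DataFileWriter<GenericRecord> writer =
    new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema));
writer.create(schema, new File("messages.avro"));
GenericRecord rec = new GenericData.Record(schema);
rec.put("topic", "pv");  // ... set the remaining Message fields the same way
writer.append(rec);
writer.close();

// Reading: the reader recovers the schema from the file header by itself.
DataFileReader<GenericRecord> reader =
    new DataFileReader<>(new File("messages.avro"), new GenericDatumReader<GenericRecord>());
Schema embedded = reader.getSchema();
```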

Avro's protocol opens up many optimization opportunities:

Projection: by scanning the schema, only the parts of the data you care about need to be deserialized.

Schema versioning and mapping: readers and writers with different schema versions can exchange data by consulting the schemas (schema aliases support the mapping), which is far superior to Thrift's approach of numbering every field.

An Avro schema can declare a sort order for the data, which serialization then honours; data can therefore be sorted without deserializing it, which is very useful in Hadoop.

Another Avro feature is its block-based list structure, which removes the limit of representing a size with a single integer. An array or map, for example, consists of a series of blocks, each holding a counter and the corresponding elements; a counter of 0 marks the end.
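Here is a small sketch of that block structure through Avro's low-level Encoder API; the item values are arbitrary examples:

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

ByteArrayOutputStream out = new ByteArrayOutputStream();
BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);

enc.writeArrayStart();         // begin a block-structured array
enc.setItemCount(2);           // this block's counter
enc.startItem();
enc.writeString("daydreaming");
enc.startItem();
enc.writeString("hacking");
enc.writeArrayEnd();           // writes the terminating zero counter
enc.flush();
```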

Thrift provides multiple serialization implementations:

TCompactProtocol: the most efficient binary serialization protocol, though not all language bindings support it.

TBinaryProtocol: the default, straightforward binary serialization protocol.

Unlike Avro, Thrift stores a tag in front of every field, identifying the field's type and its numeric ID (defined in the IDL, used for versioning). Within a single batch of data these tags are all identical, so when there are many records this is plainly wasteful.

RPC services

Avro provides:

HttpServer: the default, built on a Jetty core.

NettyServer: a newer, Netty-based server.

Thrift provides:

TThreadPoolServer: a multi-threaded server.

TNonblockingServer: a single-threaded, non-blocking server.

THsHaServer: a multi-threaded, non-blocking server.

Benchmarking

Test environment: two machines (4-core Intel Xeon 2.66 GHz, 8 GB memory, Linux), one acting as client and one as server.

Object definition:

```
record Message {
  string topic;
  bytes payload;
  long createdTime;
  string id;
  string ipAddress;
  map<string,string> props;
}
```

Actual instance:

```java
msg.createdTime = System.nanoTime();
msg.ipAddress   = "127.0.0.1";
msg.topic       = "pv";
msg.payload     = new byte[100];
msg.id          = UUID.randomUUID().toString();
msg.props       = new HashMap<String,String>();
msg.props.put("author", "tjerry");
msg.props.put("date", new Date().toString());
msg.props.put("status", "new");
```

Serialization size

Avro produces the smallest serialized output.

Serialization speed

Thrift-binary looks fastest here, simply because its serialization format is so straightforward.

Deserialization speed

Thrift is very fast here, which is tied to the zero-copy improvements inside its implementation; in the combined RPC benchmark, however, this advantage does not seem to show up.

The serialization measurements were collected using the framework provided by http://code.google.com/p/thrift-protobuf-compare/.

Raw output:

```
Starting
, Object create, Serialize, /w Same Object, Deserialize, and Check Media, and Check All, Total Time, Serialized Size
avro-generic   , 8751.30500, 10938.00000, 1696.50000, 16825.00000, 16825.00000, 16825.00000, 27763.00000, 221
avro-specific  , 8566.88000, 10534.50000, 1242.50000, 18157.00000, 18157.00000, 18157.00000, 28691.50000, 221
thrift-compact , 6784.61500, 11665.00000, 4214.00000,  1799.00000,  1799.00000,  1799.00000, 13464.00000, 227
thrift-binary  , 6721.19500, 12386.50000, 4478.00000,  1692.00000,  1692.00000,  1692.00000, 14078.50000, 273
```

RPC test case:

The client sends the server batches of fixed-length messages; so that both serialization and deserialization are exercised, the server returns the original messages to the client.

```
array<Message> publish(string context, array<Message> messages);
```

The test used the Avro Netty server and the Thrift HsHa server, since both are built on asynchronous I/O and suited to high-concurrency environments.

Results

In this test, before the network becomes the bottleneck, Avro Netty delivers higher throughput and faster responses than Thrift HsHa, while Avro uses somewhat more memory.

Further experiments show there is no absolute answer to whether the Avro or the Thrift service is faster: it depends on the test case, that is, on how the program is used. The current test case runs in batch mode, sending large volumes of fine-grained objects (close to how the back-end TT and Hadoop use them), and there Avro has the edge. For a chatty client that sends one object per call, the situation reverses and Thrift becomes more efficient. And as blobs take up a larger share of the data structure, the difference between Avro and Thrift shrinks as well.

Conclusion

Thrift suits static program-to-program data exchange, where the schema is known in advance and relatively fixed.

Avro adds dynamic schema support on top of what Thrift offers, with performance that does not lag behind Thrift's.

Avro's explicit schema design makes it better suited to building general-purpose tools and platforms for data exchange and storage, especially on the back end.

Thrift's current advantages are broader language support and relative maturity.
wanghaisheng commented 9 years ago

Java cross-language implementation approaches

Background:

In large distributed Java applications, to make developers' lives easier, the underlying RPC framework usually wraps the invocation plumbing so that application developers can publish a service by writing nothing more than a plain POJO. Popular frameworks such as Spring Remoting and JBoss Remoting all work this way.

As business needs evolve, upper-layer applications may want to adopt non-Java technologies such as PHP or Ruby on Rails, while constraints of the Java GC and memory model may push some low-level services toward higher-performance, more flexible technologies such as C++ or Python.

At that point cross-language interoperability becomes the issue: how do you make the system cross-language without changing the existing POJO-based RPC framework? That puzzle lands on the middleware developer's desk.

The problem:

Let us extract the requirements from the scenario above:

1) The existing way of publishing Java RPC services must not change; services are still plain POJOs.

2) Upper-layer non-Java applications must be able to call the services published in POJO form on the server side.

3) Lower-layer non-Java applications, such as C++ or Python, must be able to publish services in the same form as a POJO service.

4) Application developers must be given an elegant interface.

Survey of the industry:

Fortunately we are not the first to run into this problem, so let's look at what our predecessors in the industry (mainly internet companies) have left us.

Google Protocol Buffers: Google, as ever, was a step ahead of everyone. Early in the build-out of its architecture Google recognized how important cross-language support was, and in the same period as Bigtable and GFS it produced a custom cross-language solution: Google Protocol Buffers. It was not open-sourced until 2008, though, and as the saying goes, a state's sharpest tools are not shown to outsiders: the Protocol Buffers we can see is a cut-down edition. There is no map support (by some accounts Google has it internally), no native-C performance optimization for Python, and no real RPC service; one was bolted on later, but its usability is barely passable, with no multiple parameters and no exceptions. We should not expect too much on that front anyway, because Google itself says "protocol buffers - a language-neutral, platform-neutral, extensible way of serializing structured data". It is just a serialization format. What sets it apart from Hessian or Java serialization is that Protocol Buffers generates target-language code from a proto (IDL) file that defines the data structures, which greatly reduces development work; unfortunately the generated code is highly invasive and cannot produce the plain POJO Java objects we need.

Even so, Protocol Buffers taught us a lot:

- Compact encoding: numbers are serialized as Base-128 varints, cutting network overhead.
- Non-self-describing data: Protocol Buffers embeds each data structure's description in the generated code, so only the data itself needs to be transmitted for an instance to be deserialized.
- Immutable objects: the generated Java code uses the builder/message pattern. A message is immutable, with getters only and no setters, and every message is created through its corresponding builder; clearly Google has been borrowing from functional programming. (A sketch follows this list.)
- Asynchronous RPC: rudimentary as the Protocol Buffers RPC is, from the start it offered only asynchronous, callback-style invocation. Google had evidently already gone asynchronous, and anyone in the internet industry knows that is no small feat.
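A minimal sketch of that builder/message pattern, assuming a hypothetical Person message generated by protoc from a .proto with fields user_name, favourite_number, and a repeated interests (the names are illustrative, not from the original text):

```java
// Person is a hypothetical class generated by protoc from such a .proto file.
Person person = Person.newBuilder()        // builder: the only way to construct a message
    .setUserName("Martin")
    .setFavouriteNumber(1337L)
    .addInterests("daydreaming")
    .addInterests("hacking")
    .build();                              // message: immutable, getters only

byte[] bytes = person.toByteArray();       // non-self-describing: just the data, no field names
Person parsed = Person.parseFrom(bytes);   // the schema lives in the generated code
```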

Facebook Thrift: April 1st, heh, that's right: Thrift was open-sourced by Facebook on April Fools' Day 2007, rather in Google's style. Thrift is Facebook's own cross-language solution. You may ask how it differs from Protocol Buffers. OK, first look at its definition:

Thrift is a software framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml.

So it is, explicitly, a framework for cross-language service development. Its features include code generation (Protocol Buffers has that), cross-language support (Protocol Buffers has that), and service development (well, Protocol Buffers has that too). At first glance it occupies exactly the same ground as Protocol Buffers, with a whiff of the reinvented wheel.

We had the same doubt at first, so let's keep going. Beyond what they share (both tackle the cross-language problem), Thrift differs from Protocol Buffers in important ways:

1) It provides a complete service stack: a whole RPC service framework, which Protocol Buffers lacks. This is Thrift's killer feature; if you want to build a service, Thrift even has the stack layers implemented for you. Wonderful.

2) The Thrift paper puts it this way: "Thrift enforces a certain messaging structure when transporting data, but it is agnostic to the protocol encoding in use." In other words, it does not care which serialization you use: Hessian, XML, even Protocol Buffers.

3) Next, one has to admire the power of Thrift's service interfaces: multiple parameters, exceptions, and both synchronous and asynchronous calls are supported. This is exactly what we want, and it instantly shows up Protocol Buffers.

4) Rich collection support: map and set are both there. Tremble, Protocol Buffers.

At this point the reader will ask: problem solved, then, it's Thrift? I smile and say nothing. Powerful as Thrift is, it is still not what we want: the code Thrift generates is also strongly invasive, so POJO objects cannot be published as services. Another sore point is that however strong the Thrift stack is, it is incompatible with our existing stacks such as JBoss Remoting and Spring Remoting, which add header information, whereas Thrift's existing transports carry no headers. It is also worth noting that the existing Thrift service clients are not thread-safe: some languages lack good thread support, above all PHP, the language Facebook uses most, so no thread-safe client implementation was provided. As a result client connections cannot be reused, which effectively means short-lived connections. (As an aside: are short connections really slower than long ones? That is a question.)

To sum up what we learn from Facebook Thrift:

1) Both synchronous and asynchronous calls are supported, which is powerful. A common arrangement is to develop the performance-critical server side asynchronously and have the usability-focused client call synchronously, which is fairly ideal.

2) Judging from the existing non-thread-safe implementation, Facebook quite possibly has a more efficient, thread-safe implementation of its own; presumably it was withheld as only loosely related to Thrift, or as core technology. Building one yourself is not actually that hard.

3) Thrift applies native-C performance optimizations to many scripting languages; on the Python side, native C makes it 20 times faster. Protocol Buffers has been working on the same optimization, planned for 2.4, but Protocol Buffers is proving as overdue as JDK 7. More maddening still, a forum post recently revealed that the engineer doing this optimization has left Google and is no longer responsible for it; all I want to know is where he went. Tears.

Apache Hadoop Avro: "Avro is a data serialization system. Avro provides functionality similar to systems such as Thrift, Protocol Buffers, etc." It admits as much itself, so we will not dwell on that.

Briefly, Avro is the framework the Hadoop project uses to move data around, and it is another cross-language solution. Avro does have highlights of its own: 1. dynamic typing; 2. untagged data; 3. no manually-assigned field IDs.

Dynamic typing makes the eyes light up. That's it: Avro keeps the metadata in an object called a schema, and with it can serialize the corresponding POJO object. This is exactly what we want. As for the other features, I honestly have not studied Avro closely; it feels harder to learn than Thrift or Protocol Buffers, and readers who know it better are welcome to educate me.

The solution:

All right, by this point the reader has a fair idea: Protocol Buffers, Thrift, and Avro all have things we want and things we do not. To solve our problem we only need to play to their strengths and sidestep their weaknesses, kneading the good parts into something of our own. The plan:

1) Use Protocol Buffers' message serialization format and code generation.

2) Use Thrift's service generation format, and implement a Thrift-style stack compatible with JBoss Remoting or Spring Remoting.

3) Serialize and deserialize the existing POJO objects via an Avro schema, as sketched below.
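As a hedged sketch of step 3: Avro's reflect API can derive a schema from an existing POJO and serialize instances without touching the class. The Message POJO and its fields here are illustrative assumptions:

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

// An ordinary POJO, published as-is: no generated code, no base class.
public class Message {
    public String topic;
    public long createdTime;
}

// Derive the schema from the class by reflection, then serialize an instance.
Schema schema = ReflectData.get().getSchema(Message.class);
ReflectDatumWriter<Message> writer = new ReflectDatumWriter<>(schema);
ByteArrayOutputStream out = new ByteArrayOutputStream();
BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
writer.write(msg, enc);  // msg: a Message instance
enc.flush();
```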

And with that everything looks perfect. Heh, don't be fooled: plenty of details remain to be worked out. It's getting late; a bowl of instant noodles and off to bed. When time allows I will share the implementation details.

wanghaisheng commented 9 years ago

When attempting to explain hypermedia, I like to use the example of navigating in a car via signposts versus a map. I realize it doesn't directly answer your question, but it may help.

When you are driving a car and reach a particular intersection, you are provided signposts indicating where you can go from that point. Similarly, hypermedia provides you with a set of options based on your current state.

A traditional RPC based API is more like a map. With a map you tend to plan your route out based on a static set of road data. One problem with maps is that they can become out of date and they provide no information about traffic or other dynamic factors.

The advantage of signposts is that they can be changed on the fly to detour traffic due to construction or to control traffic flow.

I'm not suggesting that signposts are always a better option than a map. Obviously there are pros and cons but it is valuable to be aware of both options. It is the same with hypermedia. It is a valuable alternative to the traditional RPC interface.


wanghaisheng commented 9 years ago

Schema evolution in Avro, Protocol Buffers and Thrift

So you have some data that you want to store in a file or send over the network. You may find yourself going through several phases of evolution:

  1. Using your programming language’s built-in serialization, such as Java serialization, Ruby’s marshal, or Python’s pickle. Or maybe you even invent your own format.
  2. Then you realise that being locked into one programming language sucks, so you move to using a widely supported, language-agnostic format like JSON (or XML if you like to party like it’s 1999).
  3. Then you decide that JSON is too verbose and too slow to parse, you’re annoyed that it doesn’t differentiate integers from floating point, and think that you’d quite like binary strings as well as Unicode strings. So you invent some sort of binary format that’s kinda like JSON, but binary (1, 2, 3, 4, 5, 6).
  4. Then you find that people are stuffing all sorts of random fields into their objects, using inconsistent types, and you’d quite like a schema and some documentation, thank you very much. Perhaps you’re also using a statically typed programming language and want to generate model classes from a schema. Also you realize that your binary JSON-lookalike actually isn’t all that compact, because you’re still storing field names over and over again; hey, if you had a schema, you could avoid storing objects’ field names, and you could save some more bytes!

Once you get to the fourth stage, your options are typically Thrift, Protocol Buffers or Avro. All three provide efficient, cross-language serialization of data using a schema, and code generation for the Java folks.

Plenty of comparisons have been written about them already (1, 2, 3, 4). However, many posts overlook a detail that seems mundane at first, but is actually crucial: What happens if the schema changes?

In real life, data is always in flux. The moment you think you have finalised a schema, someone will come up with a use case that wasn’t anticipated, and wants to “just quickly add a field”. Fortunately Thrift, Protobuf and Avro all support schema evolution: you can change the schema, you can have producers and consumers with different versions of the schema at the same time, and it all continues to work. That is an extremely valuable feature when you’re dealing with a big production system, because it allows you to update different components of the system independently, at different times, without worrying about compatibility.

Which brings us to the topic of today’s post. I would like to explore how Protocol Buffers, Avro and Thrift actually encode data into bytes — and this will also help explain how each of them deals with schema changes. The design choices made by each of the frameworks are interesting, and by comparing them I think you can become a better engineer (by a little bit).

The example I will use is a little object describing a person. In JSON I would write it like this:

{ "userName": "Martin", "favouriteNumber": 1337, "interests": ["daydreaming", "hacking"] }

This JSON encoding can be our baseline. If I remove all the whitespace it consumes 82 bytes.

Protocol Buffers

The Protocol Buffers schema for the person object might look something like this:

```
message Person {
    required string user_name        = 1;
    optional int64  favourite_number = 2;
    repeated string interests        = 3;
}
```

When we encode the data above using this schema, it uses 33 bytes.

Look at how the binary representation is structured, byte by byte. The person record is just the concatenation of its fields. Each field starts with a byte that indicates its tag number (the numbers 1, 2, 3 in the schema above), and the type of the field. If the first byte of a field indicates that the field is a string, it is followed by the number of bytes in the string, and then the UTF-8 encoding of the string. If the first byte indicates that the field is an integer, a variable-length encoding of the number follows. There is no array type, but a tag number can appear multiple times to represent a multi-valued field.
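As a concrete illustration, the first two fields can be worked out by hand from the standard Protocol Buffers wire-format rules (the byte values below follow from those rules; they are not copied from the post's original figure):

```java
// Field key = (tag number << 3) | wire type.
// user_name is field 1 with wire type 2 (length-delimited):
int key = (1 << 3) | 2;  // = 0x0A
// The encoded field is: 0x0A, 0x06, then the 6 UTF-8 bytes of "Martin".
// favourite_number is field 2 with wire type 0 (varint): (2 << 3) | 0 = 0x10,
// followed by 1337 as a Base-128 varint: 0xB9 0x0A.
```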

This encoding has consequences for schema evolution: because each field carries its tag number and wire type, a parser can skip over fields with unknown tags, so new fields can be added under fresh tag numbers without breaking old readers, as long as existing tag numbers are never changed or reused.

This approach of using a tag number to represent each field is simple and effective. But as we’ll see in a minute, it’s not the only way of doing things.

Avro

Avro schemas can be written in two ways, either in a JSON format:

{ "type": "record", "name": "Person", "fields": [ {"name": "userName", "type": "string"}, {"name": "favouriteNumber", "type": ["null", "long"]}, {"name": "interests", "type": {"type": "array", "items": "string"}} ] }

…or in an IDL:

```
record Person {
    string               userName;
    union { null, long } favouriteNumber;
    array<string>        interests;
}
```

Notice that there are no tag numbers in the schema! So how does it work?

Here is the same example data encoded in just 32 bytes.

Strings are just a length prefix followed by UTF-8 bytes, but there’s nothing in the bytestream that tells you that it is a string. It could just as well be a variable-length integer, or something else entirely. The only way you can parse this binary data is by reading it alongside the schema, and the schema tells you what type to expect next. You need to have the exact same version of the schema as the writer of the data used. If you have the wrong schema, the parser will not be able to make head or tail of the binary data.

So how does Avro support schema evolution? Well, although you need to know the exact schema with which the data was written (the writer’s schema), that doesn’t have to be the same as the schema the consumer is expecting (the reader’s schema). You can actually give two different schemas to the Avro parser, and it uses resolution rules to translate data from the writer schema into the reader schema.
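In Java this means handing both schemas to the reader; a minimal sketch, where the schema file names are illustrative:

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;

Schema writerSchema = new Schema.Parser().parse(new File("person-v1.avsc"));  // schema the data was written with
Schema readerSchema = new Schema.Parser().parse(new File("person-v2.avsc"));  // schema this consumer expects

// The reader applies Avro's resolution rules to translate between the two.
GenericDatumReader<GenericRecord> reader =
    new GenericDatumReader<>(writerSchema, readerSchema);
Decoder decoder = DecoderFactory.get().binaryDecoder(encodedBytes, null);
GenericRecord person = reader.read(null, decoder);
```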

This has some interesting consequences for schema evolution: fields are matched between the writer's and the reader's schema by name, so fields can be added or removed over time (as long as a default value is declared for them), and fields can even be renamed via schema aliases.

This leaves us with the problem of knowing the exact schema with which a given record was written. The best solution depends on the context in which your data is being used: in a file, the schema can be stored once in the file header; on a network connection, the two endpoints can negotiate the schema during the handshake; elsewhere, you can tag each record with a schema version number and keep a registry of schema versions.

One way of looking at it: in Protocol Buffers, every field in a record is tagged, whereas in Avro, the entire record, file or network connection is tagged with a schema version.

At first glance it may seem that Avro's approach suffers from greater complexity, because you need to go to the additional effort of distributing schemas. However, I am beginning to think that Avro's approach also has some distinct advantages: the schema definition stays free of tag-number bookkeeping, the encoding is at least as compact (32 bytes versus 33 in this example), and the writer's schema is always at hand, so you know exactly how the data was written.

Thrift

Thrift is a much bigger project than Avro or Protocol Buffers, as it’s not just a data serialization library, but also an entire RPC framework. It also has a somewhat different culture: whereas Avro and Protobuf standardize a single binary encoding, Thrift embraces a whole variety of different serialization formats (which it calls “protocols”).

Indeed, Thrift has two different JSON encodings, and no fewer than three different binary encodings. (However, one of the binary encodings, DenseProtocol, is only supported in the C++ implementation; since we’re interested in cross-language serialization, I will focus on the other two.)

All the encodings share the same schema definition, in Thrift IDL:

```
struct Person {
  1: string       userName,
  2: optional i64 favouriteNumber,
  3: list<string> interests
}
```

The BinaryProtocol encoding is very straightforward, but also fairly wasteful: it takes 59 bytes to encode our example record.

The CompactProtocol encoding is semantically equivalent, but uses variable-length integers and bit packing to reduce the size to 34 bytes.

As you can see, Thrift’s approach to schema evolution is the same as Protobuf’s: each field is manually assigned a tag in the IDL, and the tags and field types are stored in the binary encoding, which enables the parser to skip unknown fields. Thrift defines an explicit list type rather than Protobuf’s repeated field approach, but otherwise the two are very similar.
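For example, evolving the Thrift struct means giving any new field a fresh tag; in this hedged sketch, the twitterHandle field is a hypothetical addition, not part of the original example:

```
struct Person {
  1: string       userName,
  2: optional i64 favouriteNumber,
  3: list<string> interests,
  4: optional string twitterHandle  // new field under a fresh tag; old readers skip it
}
```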

In terms of philosophy, the libraries are very different though. Thrift favours the “one-stop shop” style that gives you an entire integrated RPC framework and many choices (with varying cross-language support), whereas Protocol Buffers and Avro appear to follow much more of a “do one thing and do it well” style.

This post has been translated into Korean by Justin Song.