tiny-dnn / tiny-dnn

header only, dependency-free deep learning framework in C++14
http://tiny-dnn.readthedocs.io

Deep Learning models serialization #188

Closed edgarriba closed 8 years ago

edgarriba commented 8 years ago

Hello everybody!

It's no secret that the number of deep learning frameworks has grown rapidly as computer scientists have shown that deep models can solve what some researchers call "superhuman" tasks. This growth has brought a matching proliferation of serialization formats, one per framework: Caffe, Torch, TensorFlow, MatConvNet, among others.

That said, I would like to start a discussion here about how the deep learning community could come together to standardize a framework-agnostic protocol for sharing trained models easily and avoiding spaghetti converters.

Feel free to refer developers from as many deep learning frameworks as possible here and discuss the issue.

Thx!

edgarriba commented 8 years ago

/cc @bhack @nyanp @naibaf7 @soumith @ajtulloch @hughperkins @Yangqing

soumith commented 8 years ago

everyone converges to Caffe's prototxt. problem solved.
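For reference, a Caffe prototxt model description is a human-readable protobuf text file. A minimal illustrative fragment (layer names and sizes are made up):

```
name: "TinyNet"
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 8
    kernel_size: 3
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "relu1"
}
```

The weights themselves live in a separate binary `.caffemodel` file; the prototxt carries only the topology and hyperparameters.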

bhack commented 8 years ago

Extending Netflix jsongraph. Can a graph abstraction cover the majority of the cases of production nets? /cc @vrv
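To make the idea concrete, a net's topology can be expressed as plain JSON that any framework could parse. A minimal sketch in Python; the schema here (`nodes`, `op`, `inputs`, `params`) is hypothetical, not the Netflix jsongraph format itself:

```python
# Sketch: framework-agnostic graph serialization as plain JSON.
import json

net = {
    "nodes": [
        {"name": "input", "op": "data", "inputs": []},
        {"name": "conv1", "op": "conv2d", "inputs": ["input"],
         "params": {"kernel": 3, "out_channels": 8}},
        {"name": "relu1", "op": "relu", "inputs": ["conv1"]},
        {"name": "fc1", "op": "fully_connected", "inputs": ["relu1"],
         "params": {"units": 10}},
    ]
}

serialized = json.dumps(net, indent=2)  # what would be shared on disk
restored = json.loads(serialized)       # any language/framework can read it back

# the topology survives the round trip
assert [n["name"] for n in restored["nodes"]] == ["input", "conv1", "relu1", "fc1"]
```

A flat node list with explicit input edges is enough to describe a DAG, which covers most production nets; weights would still need a binary container alongside.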

bhack commented 8 years ago

@karpathy What do you think of a graph abstraction and JSON format, given your past experience with DL in JavaScript?

edgarriba commented 8 years ago

@soumith that's true, however IMO the drawback is that by centralizing on Caffe's prototxt and caffemodel you are constrained to the layers supported by Caffe itself. That said, Caffe already supports HDF5, which I think is a good starting point for a framework-agnostic protocol for sharing models. This is a complaint I have heard from several users who have to deal with models trained in different frameworks.
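To illustrate the HDF5 route, here is a minimal sketch of dumping and restoring named weight tensors with `h5py` (assumed installed); the layer names and layout are illustrative, not an agreed standard:

```python
# Sketch: framework-agnostic weight dump via HDF5 (assumes h5py + numpy).
import numpy as np
import h5py

# hypothetical weights keyed by "layer/parameter" paths
weights = {
    "conv1/kernel": np.random.rand(3, 3, 1, 8).astype(np.float32),
    "conv1/bias": np.zeros(8, dtype=np.float32),
}

# write: "/" in dataset names creates HDF5 groups automatically
with h5py.File("model.h5", "w") as f:
    for name, array in weights.items():
        f.create_dataset(name, data=array)

# read back: any framework with an HDF5 binding can do this
with h5py.File("model.h5", "r") as f:
    restored = {name: f[name][...] for name in weights}

for name in weights:
    assert np.array_equal(weights[name], restored[name])
```

HDF5 only solves the weight-container half of the problem; the graph topology would still need a shared description on top.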

naibaf7 commented 8 years ago

@edgarriba That may soon no longer be a problem. They are talking about decentralizing layer development into a layer zoo, where the layer IDs in the prototxt would be hashed instead of assigned linearly, making it possible to merge prototxt versions across branches etc.

I would be strongly in favour of pushing that: https://github.com/BVLC/caffe/issues/1896

@soumith Agreed.

bhack commented 8 years ago

@naibaf7 What is the share of papers with released models in Caffe format (i.e. we could do a quick stat on http://gitxiv.com)? I've seen some attempts to port Caffe models to TF, but not ports of TF models to Caffe format.

bhack commented 8 years ago

@sguada was a core member of the Caffe team. What do you think about the differences in serialization between the Caffe Model Zoo and https://github.com/tensorflow/models/? Could there be a path to standardizing the model format and parameters?

naibaf7 commented 8 years ago

@bhack The Caffe prototxt files are a nice serialization of models that can easily be converted to other frameworks. Does TensorFlow have something like that as well? The Python code is definitely not easy to use as a cross-framework model format. To answer the question myself: https://www.tensorflow.org/versions/r0.8/how_tos/tool_developers/index.html
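For context, TensorFlow serializes graphs as a `GraphDef` protobuf, which also has a text form roughly comparable to a prototxt. A minimal illustrative fragment (node names are made up):

```
node {
  name: "input"
  op: "Placeholder"
  attr {
    key: "dtype"
    value { type: DT_FLOAT }
  }
}
node {
  name: "weights"
  op: "Const"
  attr {
    key: "dtype"
    value { type: DT_FLOAT }
  }
}
node {
  name: "matmul"
  op: "MatMul"
  input: "input"
  input: "weights"
}
```

Unlike Caffe, a frozen `GraphDef` can embed the weights directly as `Const` nodes, which is what the "freezing" step mentioned below produces.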

bhack commented 8 years ago

Yes see also "Freezing" section. /cc @mrry

bhack commented 8 years ago

I think that we can close this.