turian commented 3 years ago

Ask a Question

Question

I am curious if there is a common human-writable text format for specifying the architecture (untrained) of a neural network.

Further information

Relevant Area: best practices
Is this issue related to a specific model? No

Notes

If there were a concise common text format for specifying the architecture of neural networks (YAML, JSON, (shudder) XML, or even Python, etc.), it would greatly encourage reproducibility of neural network research. For example, instead of papers including a cryptic figure and an incomplete textual description, it could become the convention that everyone links to the URL of the ONNX architecture schema, which is hosted on GitHub because it is so small.

To explain my use case further, which I believe is quite commonplace for me and other researchers:

You learn about a new neural network architecture that you want to try training, but with new data or with a different training regime (because you use a different platform or GPU setup).
You Google it and find a GitHub repo with the model defined in some Python file in your favorite framework, or in a framework you don't understand. This is surrounded by a bunch of training, inference, and preprocessing code that you can't use, and a model pretrained on a different dataset than the one you want.
What you want is just to be able to grab the model architecture as JSON, transpile it into your framework, and then insert it into your typical pipeline orchestration + training code.

What makes this a little tricky is that it might actually require a DSL: there might be a family of architectures whose config parameters are passed in as JSON, and you might want simple for-loops and sub-graphs.
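As a rough sketch of the idea, assuming a purely hypothetical JSON schema: the keys `params`, `layers`, `repeat`, and `block`, the `$name` parameter syntax, and the function names are all invented for illustration and are not part of ONNX or NNEF. It shows config parameters plus a simple loop construct being expanded into a flat, framework-neutral layer list.

```python
import json

# Hypothetical architecture spec (NOT an ONNX or NNEF schema).
# "$name" refers to a config parameter; "repeat" is a simple for-loop.
SPEC = """
{
  "params": {"depth": 3, "width": 64},
  "layers": [
    {"op": "conv", "channels": "$width", "kernel": 3},
    {"repeat": "$depth", "block": [
      {"op": "conv", "channels": "$width", "kernel": 3},
      {"op": "relu"}
    ]},
    {"op": "linear", "units": 10}
  ]
}
"""


def resolve(value, params):
    """Replace a "$name" string with the matching config parameter."""
    if isinstance(value, str) and value.startswith("$"):
        return params[value[1:]]
    return value


def expand(layers, params):
    """Unroll "repeat" blocks and substitute parameters into a flat layer list."""
    flat = []
    for layer in layers:
        if "repeat" in layer:
            count = resolve(layer["repeat"], params)
            for _ in range(count):
                # Blocks may themselves contain nested "repeat" entries.
                flat.extend(expand(layer["block"], params))
        else:
            flat.append({key: resolve(val, params) for key, val in layer.items()})
    return flat


def load_architecture(text, overrides=None):
    """Parse the spec and apply optional parameter overrides."""
    spec = json.loads(text)
    params = {**spec.get("params", {}), **(overrides or {})}
    return expand(spec["layers"], params)


if __name__ == "__main__":
    # Same family of architectures, shallower variant.
    for layer in load_architecture(SPEC, {"depth": 2}):
        print(layer)
```

A framework-specific backend would then only need to map each `op` entry to its own layer class, which is the "transpile" step described above.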

I believe a good notation would also excite the community, because people would know that their work would be much easier to port to other libraries.

It would definitely be my preference that this format be human-writable, not a huge text file that humans cannot easily verify is correct.

gyenesvi commented 3 years ago

Have you checked NNEF? https://www.khronos.org/nnef It has related tools and model-zoo with example models here: https://github.com/KhronosGroup/NNEF-Tools/tree/master/models

turian commented 3 years ago

@gyenesvi maybe I'm missing something, but NNEF appears to be like ONNX? That is, it stores the model weights and isn't a compact, human-readable version of the network architecture?

gyenesvi commented 3 years ago

@turian I believe the relevant difference is that NNEF describes the network structure in a text-based format (essentially a DSL), which is much more human-readable and editable than ONNX's binary format. The model weights are necessarily stored in a binary format for compactness. Have you checked some examples in the model-zoo? You can see the compactness there, which is pretty similar to that of Python code in training frameworks, but it only describes the network structure, in a cleaned-up, standardized form. Isn't that what you are looking for?

turian commented 3 years ago

@gyenesvi Thank you

onnx / tutorials

Architecture (untrained) common format #226
