nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.
Apache License 2.0

Save graph/memory optimization information with the model #1709

Open kparichay opened 2 years ago

kparichay commented 2 years ago

We have three kinds of optimization being done for the models, which can become expensive over time as the optimization techniques are enhanced:

  1. topological sorting: changing the sorted order can result in a different peak memory usage #1126
  2. in-place layer/graph optimization: deciding which layers run in-place is currently a simple process, but it will become expensive as there can be multiple configurations of which groups of layers should work in-place #1708
  3. memory planner optimization: fitting all the memory requests into the minimum amount of total memory to minimize the peak memory usage #1127
  4. more information will be added once more optimization points are added

The information generated from these optimizations is as below:

  1. The ordering of the execution of the nodes (a unique integer value per layer)
  2. The mode of execution for the layer (a bool/enum per layer for in-place/out-of-place execution)
  3. Offset per tensor (an int/long offset that determines the tensor's location in the memory pool)

Information from 1. and 2. can be stored in the existing INI format while saving the model, allowing the optimization to be done offline and skipped when loading the optimized INI file. However, storing the information from 3. requires new support.

taos-ci commented 2 years ago

:octocat: cibot: Thank you for posting issue #1709. The person in charge will reply soon.

kparichay commented 2 years ago

There are two possible options to store information 3. with the model:

  1. Extend the tflite model file format to store the offset information with each tensor. The tflite model file stores information about input and output tensors only, so it would have to be extended for temporary tensor requests (@zhoonit can comment on the feasibility here)
  2. Extend the INI format to include tensor information (this information will be optional, as it won't be available for unoptimized INI models). The minimum information this must contain is a tensor-name-to-offset mapping. A mapping from layer to tensors can also be added to each layer, but it would be optional: given an INI model, each tensor name can be determined deterministically, so the layer-to-tensor mapping would serve for error checking rather than for creating the links between layers and tensors.
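To make option 2. concrete, the INI extension could look something like the sketch below. The section and key names (`[tensor_offsets]` in particular) are hypothetical and not part of nntrainer's current INI schema; the only hard requirement stated above is the tensor-name-to-offset mapping.

```ini
; Hypothetical sketch of the proposed INI extension; section and key
; names are illustrative only, not nntrainer's actual schema.

[fc1]
Type = fully_connected

; optional tensor-name -> byte-offset mapping into the memory pool
[tensor_offsets]
fc1:weight = 0
fc1:output = 2048
conv1:output = 4096
```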

Note: the information regarding 3. could be stored in the BIN file, but that is not a good option: the BIN file's purpose is to hold the model weights, and information related to model configuration (this optimization is, to a certain extent, part of the model configuration) is better suited to the INI file.

Note: the information regarding 3. could also be stored in a new file format, but we already have two file formats: INI for model architecture and configuration, and BIN for model weights. Adding a third file format is not recommended.

zhoonit commented 2 years ago

For 1., adding the offset information to the tflite *.fbs should technically do no harm, but once we start putting extra information in, it is no longer a mere mirror of the tflite file and will deviate from the original schema anyway. We need to decide whether this is acceptable before moving on.