ONNC / onnc

Open Neural Network Compiler
https://onnc.ai
BSD 3-Clause "New" or "Revised" License

Custom shape inference #154

fahrenheitjo opened this issue 5 years ago (status: Open)

fahrenheitjo commented 5 years ago

Currently I think the input and output tensor shapes are inferred through ONNX's InferShapes, but that inference is based on the ONNX operators' memory requirements. However, the lowered target compute operators might have different memory requirements (input/output tensors with shapes different from what ONNX::InferShapes computed).

Is it possible to infer the memory requirements for specific target compute operators? Is this feature currently implemented in any ONNC backend? (Sorry if this question is silly; I'm quite new to this project.)
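
For context, this is roughly what I mean by relying on ONNX's shape inference; a minimal sketch using the stock ONNX C++ API (ONNC bundles its own copy of onnx, so the exact include paths and call site may differ):

```cpp
// Sketch only: running stock ONNX shape inference on a loaded model.
// The shapes it fills in follow the generic ONNX operator definitions,
// not the layout/padding a lowered target operator may actually need.
#include <onnx/onnx_pb.h>
#include <onnx/shape_inference/implementation.h>

void inferGenericShapes(onnx::ModelProto& model)
{
  // Namespace/header may differ in ONNC's bundled onnx.
  onnx::shape_inference::InferShapes(model);
}
```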

Thank you.

tigercosmos commented 5 years ago

I am not sure if you are talking about this https://github.com/ONNC/onnc/blob/master/lib/Target/NvDla/NvDlaMemInfoPass.cpp

fahrenheitjo commented 5 years ago

> I am not sure if you are talking about this https://github.com/ONNC/onnc/blob/master/lib/Target/NvDla/NvDlaMemInfoPass.cpp

@tigercosmos I'm referring to the tensor sizes that are later used for memory allocation (e.g. https://github.com/ONNC/onnc/blob/368d26a378fe4d27e3b9a32e3db0999650efd44c/lib/CodeGen/SetMemOperand.cpp#L31). How are the tensor sizes computed? I assume each IR operation has a set of input tensors, a set of output tensors, and a rule for computing their sizes. So far I have found that the tensor sizes are computed within the ONNX graph (by using InferShapes).

My question is whether there is already a mechanism to compute the tensor sizes for my target operators. For example, a convolution operator might add padding between the output maps, so the output tensor (and hence the memory requirement) will be larger than the original tensor shape computed within the ONNX graph.
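
To make the convolution example concrete, here is a toy calculation (the 32-byte row alignment is a made-up constraint for illustration, not taken from any real target):

```cpp
#include <cstdint>

// Illustrative only: how per-row padding can grow the real buffer size
// beyond what the ONNX-inferred shape (N x C x H x W x elemSize) suggests.
// The 32-byte row alignment is a made-up constraint for this example.
uint64_t paddedConvOutputSize(uint64_t n, uint64_t c, uint64_t h, uint64_t w,
                              uint64_t elemSize, uint64_t rowAlign = 32)
{
  uint64_t rowBytes = w * elemSize;
  uint64_t paddedRowBytes = ((rowBytes + rowAlign - 1) / rowAlign) * rowAlign;
  return n * c * h * paddedRowBytes; // >= n * c * h * w * elemSize
}
```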

a127a127 commented 5 years ago

@fahrenheitjo Is the padding just a memory constraint, or is it also involved in the following computation?

If it is just a memory constraint, there is a class called TargetMemInfo; you can check out X86TargetMemInfo: https://github.com/ONNC/onnc/blob/1c896889628eb9ae4b5f97674db4a3ff6f411c45/lib/Target/X86/TargetInfo/X86TargetMemInfo.cpp#L13-L49

The LinearScanMemAlloc pass uses TargetMemInfo::getTensorMemorySize to ask the target backend for the real memory requirement: https://github.com/ONNC/onnc/blob/1c896889628eb9ae4b5f97674db4a3ff6f411c45/lib/CodeGen/LinearScanMemAlloc.cpp#L67
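
Roughly, a backend can return a padded/aligned size from that hook. A minimal sketch (names follow the linked X86TargetMemInfo.cpp approximately; check the real TargetMemInfo.h for the exact MemSize fields and Tensor accessors):

```cpp
// Sketch only: a backend TargetMemInfo that reserves more memory than the
// ONNX-inferred shape implies. Include paths, the Tensor accessor, and the
// MemSize constructor are approximations of the real ONNC headers.
#include <onnc/IR/Compute/Tensor.h>
#include <onnc/Target/TargetMemInfo.h>

namespace onnc {

class MyTargetMemInfo : public TargetMemInfo
{
public:
  MemSize getTensorMemorySize(const Tensor& pVal) override
  {
    const uint64_t elemSize = 4;  // assume fp32 for brevity
    const uint64_t align    = 32; // made-up target alignment

    uint64_t bytes = elemSize;
    for (auto dim : pVal.getDimensions()) // accessor name approximate
      bytes *= dim;

    // Round up so LinearScanMemAlloc reserves the padded amount.
    bytes = ((bytes + align - 1) / align) * align;
    return MemSize(align, bytes); // verify argument order against MemSize
  }
};

} // namespace onnc
```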

Is this suitable for you?

fahrenheitjo commented 5 years ago

@a127a127 Thank you for the response. getTensorMemorySize is too generic and doesn't take into account which operation the tensor will be used for. I actually need a way to specify the memory requirements for specific operations. For example, I might have a 3x3 convolution operation optimized for various use cases, each with slightly different memory requirements.
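
To illustrate what I mean, here is a purely hypothetical interface (none of these names exist in ONNC; stand-in types keep it self-contained): a size query that can see which operation consumes the tensor.

```cpp
// Purely hypothetical sketch of the interface I am asking for; none of these
// names exist in ONNC. Stand-in types are used so the idea is self-contained.
#include <cstdint>
#include <vector>

struct TensorStub   { std::vector<uint64_t> dims; uint64_t elemSize = 4; };
struct OperatorStub { bool isOptimized3x3Conv = false; };

// Size query that can see which operator consumes the tensor, so a
// specialized 3x3 convolution can ask for extra inter-map padding.
uint64_t tensorSizeFor(const TensorStub& t, const OperatorStub& user)
{
  uint64_t bytes = t.elemSize;
  for (uint64_t d : t.dims)
    bytes *= d;
  if (user.isOptimized3x3Conv)
    bytes += 64 * t.dims.size(); // made-up padding rule, for illustration only
  return bytes;
}
```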

Another question: where/how are the output tensor sizes computed for a specific operation?