Hi @meremeev ,
Thanks for reaching out!
I imagine with enough work this may be possible, but I'd have to investigate what changes are necessary. I haven't personally spent much time exploring this feature because most embedded-systems use cases target static shapes.
I'd have to dig into it a bit more to get back with a meaningful answer.
Do you mind sharing your use case for dynamic shapes? I'm curious to understand the motivation for the feature.
Best, John
Hi John,
Luminar is a LiDAR company, but in addition to hardware we provide a software SDK. Part of the SDK functionality is a semantic segmentation model. Depending on the scan pattern settings, the size of the point cloud can differ, so the only way to support this flexibility is to have a model that can handle point clouds of different sizes.
Aside from our use case, embedded systems are a very large domain. They cover everything from simple, low-cost, single-function devices (e.g. a doorbell with face recognition) to very complex, multi-functional devices with substantial computational resources (e.g. a self-driving autopilot). For such systems it is essential to have flexibility in the format/size of input/sensor data.
Another factor is the application domain. In image recognition/object detection the input is usually a fixed-size image, but areas such as sequence analysis, voice recognition, motion detection, and video analysis have a dimension for which size flexibility is very important.
So if you see torch2trt as a universal solution for converting Torch models to TensorRT, support for dynamic sizes is essential.
And I think something like this might work.
model_trt = torch2trt(model, [x], dynamic_sizes=[{0: (1, 10, 100)}])
The nvinfer1::IPluginV2DynamicExt interface works only with explicit-batch (V2) networks, and right now torch2trt builds an implicit-batch network by default. I am considering making these changes myself but would like to discuss them first.
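For reference, the explicit-batch setup on the TensorRT side would look roughly like this (a sketch against the tensorrt Python bindings; torch2trt would need to pass this flag where it currently creates its network):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# dynamic shapes and IPluginV2DynamicExt both require an
# explicit-batch network, selected via this creation flag
flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(flags)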
Another API that would be very useful is one to serialize the TensorRT engine and save it to a file. This would let us load it into a C/C++ application later. Right now we do it in a slightly hacky way.
Hi @meremeev ,
Thanks for your reply, you raise some interesting use cases!
I've done some more research on what might be possible, but I'm not yet able to assess the impact of this feature / whether we can safely integrate it here. Currently, I understand that some converters (e.g. interpolation) will require adjustment to ensure they handle dynamic shapes appropriately. Our current test cases may not reveal this, since we use the same shape for building and testing.
Another note is that TensorRT allows multiple optimization profiles (to cover multiple input shape ranges). This adds complexity and introduces some nuanced limitations (for example, INT8 calibration applies to only one profile). For your use case, do most of the tensor shapes fall within a single continuous range, or multiple ranges? I'm trying to assess whether there is a tangible benefit to supporting multiple profiles, or if it's best to just support one profile with multiple engines (if necessary).
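For concreteness, multiple profiles in the tensorrt Python API look roughly like this (a sketch; the tensor name and shape ranges are illustrative):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# one profile per expected shape range; each specifies (min, opt, max)
small = builder.create_optimization_profile()
small.set_shape('input', (1, 3, 240, 240), (1, 3, 480, 480), (1, 3, 640, 640))
config.add_optimization_profile(small)

large = builder.create_optimization_profile()
large.set_shape('input', (1, 3, 640, 640), (1, 3, 960, 960), (1, 3, 1280, 1280))
config.add_optimization_profile(large)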
Also, out of curiosity, have you explored the ONNX->TensorRT workflow for your purposes? This supports dynamic shapes, but perhaps has other limitations (which I'm interested to understand if this was the case for you).
Good point, I'm not sure yet if an API is needed for this, but we definitely need to at least add instructions for this to our documentation.
Is this the solution you used?
with open('model.engine', 'wb') as f:
    f.write(model_trt.engine.serialize())
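And for loading it back later (shown in Python; a C/C++ application would use the analogous nvinfer1::IRuntime API):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open('model.engine', 'rb') as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())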
Best, John
Hi John
For my use case I need only one dynamic dimension with one range/one profile. I agree, dynamic size support is a serious rework and there could be some problems to resolve.
As far as I know, TensorRT does not like multiple dynamic dimensions for the same tensor; it gives a performance warning.
I think the idea behind multiple profiles is to build multiple engines from the same network. But if we convert the model ourselves, we can always convert it multiple times.
Our current conversion pipeline uses the Torch->ONNX->TensorRT path with a dynamic input size. But this path has some problems I hope to avoid by using torch2trt. Torch->ONNX does not support some operations, has type restrictions, I cannot parameterize custom kernels, etc.
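For reference, the dynamic-axes export on the Torch side of that pipeline looks roughly like this (a sketch; the tensor and axis names are illustrative):

import torch

# mark dimension 0 of the input as dynamic so the exported
# ONNX graph accepts inputs of varying size along that axis
torch.onnx.export(
    model, x, 'model.onnx',
    input_names=['points'],
    output_names=['labels'],
    dynamic_axes={'points': {0: 'num_points'}},
)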
Actually, I am not sure torch2trt supports those ops either, because I already have them converted to custom kernels. Something to try.
But the major problem comes from ONNX format compatibility. Torch 1.6 produces ONNX IR version 0.0.6, which is compatible with TensorRT 7, but conversion for TensorRT 6 requires IR 0.0.3, which means Torch 1.2 or 1.3. So I want to find a more direct conversion path without the extra layer.
Yes, we use exactly the same code to serialize the TensorRT engine.
Thanks, Mark
Hi @meremeev ,
It seems like supporting just one dynamic range may be sufficient (or even preferred), with torch2trt simply run multiple times if needed. The only potential downside I see is memory overhead from duplicating weights, but if this proves to be an issue it could be addressed later. I may explore this feature more soon, but I still can't make any guarantees. If you happen to experiment / discover more, I'm curious to hear.
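As a rough sketch of that multiple-engine workaround (the sizes, input layout, and padding policy here are purely illustrative, not a tested recipe):

import torch
from torch2trt import torch2trt

# build one fixed-shape engine per expected input size bucket
sizes = [1024, 4096, 16384]
engines = {
    n: torch2trt(model, [torch.randn(1, n, 4).cuda()])
    for n in sizes
}

def run(points):
    # pad up to the smallest engine that fits this input
    n = next(s for s in sizes if s >= points.shape[1])
    padded = torch.zeros(1, n, 4, device=points.device)
    padded[:, :points.shape[1]] = points
    return engines[n](padded)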
Thanks for sharing your experience with ONNX. You might find the following helpful for your purposes.
I've considered streamlining this process, which I may re-explore if it proves beneficial. For now, hopefully you find the above information helpful.
Best, John
I don't know if it is appropriate to mention it here, but depending on the set of operations you use, you might be able to do this with TRTorch.
To be more precise, it will work if your model has a UNet-like architecture for which the upsampling factor is always the same (e.g. times 2).
It is a bit more opaque than this repo but works very well for traditional CNN architectures.
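For instance, TRTorch compiles a TorchScript module against a min/opt/max shape range, roughly like this (a sketch based on TRTorch's compile spec at the time; the shapes are illustrative):

import torch
import trtorch

scripted = torch.jit.script(model)
trt_module = trtorch.compile(scripted, {
    'input_shapes': [{
        'min': [1, 3, 224, 224],
        'opt': [1, 3, 512, 512],
        'max': [1, 3, 1024, 1024],
    }],
})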
Best, Matthieu
Thank you! Very interesting.
Hi, does torch2trt now support a custom dynamic input size?
It did not at the time of this conversation, which was a while ago; I'm not sure about the current status.
Is there any possibility to generate a TensorRT engine with a dynamic input size? If not, do you have any plans to provide this functionality, or ideas on how to approach it?