lix19937 / tensorrt-insight

deep insight tensorrt

nonzero layer test #8

Open lix19937 opened 1 month ago

lix19937 commented 1 month ago

Ops to cover:
Tensor mask op, for example: x = x[x.sum(dim=-1) > 0]
torch.where(condition), for example: torch.where(x > 0)
torch.nonzero(x)
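A minimal PyTorch sketch of why these three ops are the interesting cases: with the same fixed input shape, the output shape depends on the values in the tensor. The helper name `output_shapes` and the sample tensors are made up for illustration and are not part of the repo.

```python
import torch


def output_shapes(x: torch.Tensor) -> dict:
    """Return the output shape of each of the three ops for input x."""
    masked = x[x.sum(dim=-1) > 0]    # boolean mask: keeps rows whose sum is positive
    where_idx = torch.where(x > 0)   # tuple of index tensors, one per dimension
    nz = torch.nonzero(x)            # shape (num_nonzero, x.dim())
    return {
        "mask": tuple(masked.shape),
        "where": tuple(where_idx[0].shape),
        "nonzero": tuple(nz.shape),
    }


# Two inputs with the same fixed shape (3, 2) but different contents.
a = torch.tensor([[1.0, 0.0], [0.0, 0.0], [2.0, 3.0]])
b = torch.tensor([[0.0, 0.0], [0.0, 0.0], [0.0, 4.0]])

shapes_a = output_shapes(a)
shapes_b = output_shapes(b)
print(shapes_a)
print(shapes_b)
```

The input shape is identical in both calls, yet every output shape differs, so the shapes after these ops cannot be known when the engine is built.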

https://docs.nvidia.com/deeplearning/tensorrt/sample-support-guide/index.html#sampleNonZeroPlugin

lix19937 commented 1 month ago

When the input has a fixed shape but a nonzero operation appears inside the network, the network is generally considered locally dynamic; currently skipped for dynamic shapes.

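My understanding:
The locally dynamic region is optimized as dynamic shape, while the remaining fixed-shape regions get the regular optimizations. To be verified.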

CUDA Graph support may require changes to the init logic.