Closed: Denghongyuan closed this issue 1 month ago
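When I create a container from the same image, I have to rerun the script and regenerate the TRT files every time; loading the previously generated TRT files fails with this error: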
[08/10/2024-15:52:46] [TRT] [E] 6: The engine plan file is generated on an incompatible device, expecting compute 8.6 got compute 8.9, please rebuild.
[08/10/2024-15:52:46] [TRT] [E] 2: [engine.cpp::deserializeEngine::951] Error Code 2: Internal Error (Assertion engine->deserialize(start, size, allocator, runtime) failed. )
A TensorRT engine is tied to the GPU it was built on; it looks like you switched GPUs.
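One way to avoid loading a stale engine after switching GPUs is to key the cached engine file on the GPU's compute capability. The following is a minimal sketch, not part of FasterLivePortrait: the function names and the `_sm89` file-name suffix are hypothetical, and the capability is passed in as a `(major, minor)` tuple (with PyTorch installed it could be obtained from `torch.cuda.get_device_capability()`).

```python
import re
from pathlib import Path

# Matches the TensorRT message quoted above, e.g.
# "expecting compute 8.6 got compute 8.9".
_MISMATCH = re.compile(r"expecting compute (\d+)\.(\d+) got compute (\d+)\.(\d+)")


def parse_compute_mismatch(log_line):
    """Return the two capabilities as ((major, minor), (major, minor)),
    in the order (expecting, got), or None if the line does not match."""
    m = _MISMATCH.search(log_line)
    if m is None:
        return None
    a, b, c, d = (int(g) for g in m.groups())
    return (a, b), (c, d)


def engine_path_for(onnx_path, capability):
    """Hypothetical naming scheme: model.onnx -> model_sm89.trt for (8, 9)."""
    major, minor = capability
    p = Path(onnx_path)
    return p.with_name(f"{p.stem}_sm{major}{minor}.trt")


def needs_rebuild(onnx_path, capability):
    """True when no engine has been cached yet for this compute capability."""
    return not engine_path_for(onnx_path, capability).exists()
```

With a scheme like this, engines built on an 8.6 card and on an 8.9 card can sit side by side in the same mounted directory, and the build script only runs when `needs_rebuild` returns True for the current GPU.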
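Thanks for the reply. I did switch GPUs: I want to compare generation speed across different GPU models. Also, I am running TensorRT mode on an NVIDIA GeForce RTX 4090 D, but generation is not as fast as in your demo video; I measure only 70 ms/frame. Are there other factors that could affect generation speed?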
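I have tested on a 4090 and it should not be that slow; it should be around 20 ms. The likely cause is that the paste_back step is fairly time-consuming. You can set https://github.com/warmshao/FasterLivePortrait/blob/918f3bcdd1ad33a94cfe668bf5dd68fcf284c64f/configs/trt_infer.yaml#L92 to False to see the real speed. I will find time later to reimplement paste back in CUDA.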
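To see how much of the per-frame time a single step such as paste back accounts for, each stage can be timed separately. This is a rough sketch under assumptions: `StageTimer` and the stage names are hypothetical and not part of FasterLivePortrait, the `time.sleep` calls are placeholders for real work, and for GPU stages the device would need to be synchronized before the timer stops for the numbers to be meaningful.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.total_s = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raised.
            self.total_s[name] += time.perf_counter() - start
            self.calls[name] += 1

    def mean_ms(self, name):
        """Average milliseconds per call for one stage."""
        return 1000.0 * self.total_s[name] / self.calls[name]


timer = StageTimer()
for _ in range(3):  # stand-in for a loop over frames
    with timer.stage("inference"):
        time.sleep(0.001)  # placeholder for the model forward pass
    with timer.stage("paste_back"):
        time.sleep(0.003)  # placeholder for compositing onto the full frame

for name in timer.total_s:
    print(f"{name}: {timer.mean_ms(name):.1f} ms/frame")
```

Comparing the per-stage averages with the step enabled and disabled shows whether it dominates the frame time.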
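The speed did improve considerably and now reaches 23 ms/frame. I am running in a virtual machine, so virtualization may account for a small performance loss. Thanks for your help.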
you are welcome😊