NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0
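As a quick illustration of the Python API described above, here is a minimal sketch using the high-level `LLM` entry point, closely following the project's quick-start pattern. The checkpoint name is illustrative; any supported Hugging Face model should work, assuming `tensorrt_llm` is installed on a machine with a compatible NVIDIA GPU.

```python
# Minimal sketch of the TensorRT-LLM Python API: define a model, let the
# library build a TensorRT engine for it, and run inference on the GPU.
# The checkpoint name is illustrative; substitute any supported model.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # engine is built on first load
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

for output in llm.generate(["Hello, my name is"], sampling_params):
    print(output.outputs[0].text)
```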

Add support for InternVL2 #2394

Closed Jeremy-J-J closed 1 week ago

Jeremy-J-J commented 3 weeks ago

Add support for InternVL2.

nv-guomingz commented 3 weeks ago

Thanks @Jeremy-J-J for your contribution. I just checked the doc in examples/multimodal, and it seems this kind of model is already supported.

Jeremy-J-J commented 3 weeks ago

> Thanks @Jeremy-J-J for your contribution. I just checked the doc in examples/multimodal, and it seems this kind of model is already supported.

It seems I don't have permission to access this website. Has it been released already? The latest main branch doesn't seem to have it yet; refer to main/examples/multimodal.

nv-guomingz commented 3 weeks ago

Hi @Jeremy-J-J, this model was just introduced into our internal code base a week ago, so it doesn't appear on the public main branch yet. I suggest you try the upcoming weekly release and check whether this feature meets your requirements.

BTW, I updated your message to protect internal information.

nv-guomingz commented 2 weeks ago

https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/multimodal#internvl2 @Jeremy-J-J

Jeremy-J-J commented 2 weeks ago

> https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/multimodal#internvl2 @Jeremy-J-J

I see. Good job! @nv-guomingz

hchings commented 1 week ago

Closing this out as InternVL2 is now supported.