triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend

mT5 directory structure #279

Open · hpk23 opened this issue 11 months ago

hpk23 commented 11 months ago

I'm trying to use the mT5 model. For mT5, the TensorRT-LLM build produces two engines, one for the encoder and one for the decoder. How should I organize the Triton model repository in this case? (All of the existing examples seem to cover decoder-only models, with a structure like the one below.)

ensemble
 - 1
 - config.pbtxt

postprocessing
 - 1
  - model.py
 - config.pbtxt

preprocessing
 - 1
  - model.py
 - config.pbtxt

tensorrt_llm
 - 1
  - model.py
 - config.pbtxt
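
For what it's worth, my current guess (and it is only a guess on my part) is to keep the model repository exactly as above and store the two engines outside of it, for example:

engines/mt5
 - encoder (encoder engine + config.json)
 - decoder (decoder engine + config.json)

and then point tensorrt_llm/config.pbtxt at those paths. Is something like that the intended direction, or does each engine need its own Triton model directory?
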
symphonylyh commented 10 months ago

Hi @hpk23, we're working on a Triton backend example for the T5 structure. Progress is tracked under https://github.com/NVIDIA/TensorRT-LLM/issues/800. We appreciate your patience for a few more weeks as we finalize the structure.
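
In the meantime, for anyone who wants to experiment: the decoder-only configs point the backend at the engine through the gpt_model_path parameter in tensorrt_llm/config.pbtxt, and an enc-dec variant would presumably add an analogous encoder path. A minimal sketch, with the encoder parameter name as a placeholder rather than a committed API:

parameters: {
  key: "gpt_model_path"
  value: { string_value: "/engines/mt5/decoder" }
}
# Hypothetical; the final parameter name may differ once enc-dec support lands.
parameters: {
  key: "encoder_model_path"
  value: { string_value: "/engines/mt5/encoder" }
}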