jkl375 opened 8 months ago
@symphonylyh
I have a question about this part of the code: how does tensorrt_llm implement the mapping between input_ids and the embeddings stored in prompt_table?
For example, suppose input_ids has length 500, consisting of 100 text tokens and 400 visual tokens, while prompt_table has shape [800, 5120], where 5120 is the embedding dimension of each visual token.
How are these input_ids mapped to embeddings?
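My current guess, based on the multimodal examples, is that the visual positions in input_ids are filled with "fake" token ids starting at vocab_size, and the embedding lookup routes any id >= vocab_size to prompt_table[id - vocab_size], while smaller ids go through the normal word embedding. Here is a minimal NumPy sketch of that routing (all sizes and the `embed` helper are hypothetical, just to illustrate the idea) — is this understanding correct?

```python
import numpy as np

# Hypothetical sizes for illustration (not taken from the TensorRT-LLM source).
vocab_size = 32000          # text vocabulary size
hidden_dim = 5120           # embedding dimension, as in the question
num_virtual_tokens = 800    # rows in prompt_table

rng = np.random.default_rng(0)
word_embedding = rng.standard_normal((vocab_size, hidden_dim)).astype(np.float32)
prompt_table = rng.standard_normal((num_virtual_tokens, hidden_dim)).astype(np.float32)

# input_ids: 100 real text tokens followed by 400 "fake" visual token ids.
# Visual positions get ids >= vocab_size; (id - vocab_size) is the row in prompt_table.
text_ids = rng.integers(0, vocab_size, size=100)
visual_ids = vocab_size + np.arange(400)   # rows 0..399 of prompt_table
input_ids = np.concatenate([text_ids, visual_ids])

def embed(input_ids):
    """Route each id to word_embedding or prompt_table based on a vocab_size threshold."""
    is_prompt = input_ids >= vocab_size
    # Clamp the "wrong-side" ids to 0 so both gathers are always in bounds.
    safe_word = np.where(is_prompt, 0, input_ids)
    safe_prompt = np.where(is_prompt, input_ids - vocab_size, 0)
    return np.where(is_prompt[:, None],
                    prompt_table[safe_prompt],
                    word_embedding[safe_word])

emb = embed(input_ids)
```

Under this scheme the [800, 5120] prompt_table can hold more virtual tokens than any single request uses; the 400 visual ids in this example only select rows 0..399.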