Is there a way to run Llama2 inference by providing the prompt as `inputs_embeds` (as allowed by the standard Llama2 forward function)? Likewise, is there an easy way of accessing the model's embeddings module, so that we can manually map input id integers to embeddings?
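For reference, here is a minimal sketch of both steps using the Hugging Face Transformers API: `get_input_embeddings()` returns the embedding module, and the forward pass accepts `inputs_embeds` in place of `input_ids`. The tiny test checkpoint below is only a stand-in; assuming you have access to the gated weights, you would swap in a real Llama-2 checkpoint such as `meta-llama/Llama-2-7b-hf`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoint for illustration; replace with a Llama-2 checkpoint.
model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Access the embedding module and map input ids to embeddings manually.
embed = model.get_input_embeddings()  # an nn.Embedding
input_ids = tokenizer("Hello, world", return_tensors="pt").input_ids
inputs_embeds = embed(input_ids)  # shape: (batch, seq_len, hidden_size)

# Run the forward pass with inputs_embeds instead of input_ids.
with torch.no_grad():
    logits_from_embeds = model(inputs_embeds=inputs_embeds).logits
    logits_from_ids = model(input_ids=input_ids).logits

# The two paths should produce identical logits, since internally the
# input_ids path just applies the same embedding module first.
print(torch.allclose(logits_from_embeds, logits_from_ids))
```

Note that recent Transformers versions also accept `inputs_embeds` in `model.generate()` for decoder-only models, though exact support depends on the library version.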