NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

SinusoidalPositionalEmbedding #1099

Open dingjingzhen opened 9 months ago

dingjingzhen commented 9 months ago

System Info

Hello, I'm adapting m2m100 to TensorRT-LLM, but m2m100 uses SinusoidalPositionalEmbedding. What should I do to make this work? https://github.com/huggingface/transformers/blob/main/src/transformers/models/m2m_100/modeling_m2m_100.py#L86
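For context, a sinusoidal positional embedding is a fixed (non-learned) table built from sin/cos functions of the position, as in the original Transformer paper. A minimal sketch of the interleaved variant is below; note this is illustrative only, and the exact column layout in Hugging Face's M2M100SinusoidalPositionalEmbedding may differ (it also applies a padding-index offset):

```python
import numpy as np

def sinusoidal_table(num_positions: int, dim: int) -> np.ndarray:
    """Build a fixed sin/cos positional-embedding table of shape
    (num_positions, dim): even columns hold sin, odd columns hold cos."""
    positions = np.arange(num_positions)[:, None]                   # (P, 1)
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions * freqs[None, :]                             # (P, dim/2)
    table = np.zeros((num_positions, dim))
    table[:, 0::2] = np.sin(angles)
    table[:, 1::2] = np.cos(angles)
    return table
```

Because the table is a pure function of position and dimension, it can in principle be precomputed as a constant lookup table rather than evaluated inside the network graph.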

Who can help?

@ncomly-nvidia

Information

Tasks

Reproduction

In tensorrt_llm/models/enc_dec/model.py, I directly reference Hugging Face's M2M100SinusoidalPositionalEmbedding, initialize it in EncDecEmbedding, and then call it in forward, as shown below:

def forward(
    self,
    input_ids,
    position_ids=None,
    token_type_ids=None,
    prompt_embedding_table=None,
    prompt_tasks=None,
    prompt_vocab_size=None,
):
    # position_ids and token_type_ids are provided inputs
    # and should not be formulated deterministically
    ptuning_args = []
    if self.use_prompt_tuning:
        ptuning_args = [prompt_embedding_table, prompt_tasks, prompt_vocab_size]
    x = self.vocab_embedding(input_ids, *ptuning_args) * self.embedding_scale
    self.register_network_output("word_embeddings", x)
    embed_pos = self.embed_positions(input_ids, x)
    embed_pos = embed_pos.to(x.device)
    hidden_states = x + embed_pos

    # if self.position_embedding:
    #     pos_emb = self.position_embedding(position_ids)
    #     self.register_network_output("position_embeddings", pos_emb)
    #     x = x + pos_emb
    # if self.token_type_embedding:
    #     x = x + self.token_type_embedding(token_type_ids)

    # if self.embedding_layernorm:
    #     x = self.embedding_layernorm(x)

    return hidden_states

Expected behavior

The engine builds successfully.

actual behavior

File "/root/miniconda3/lib/python3.10/site-packages/transformers/models/m2m_100/modeling_m2m_100.py", line 162, in forward
    bsz, seq_len = input_ids.size()
ValueError: not enough values to unpack (expected 2, got 1)
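One plausible reading of this error: the Hugging Face module unpacks input_ids into two dimensions, (batch, seq_len), while inside the TensorRT-LLM graph input_ids can be a 1-D packed tensor (e.g. when padding removal is enabled), so there is only one dimension to unpack. A minimal shape-mismatch reproduction is below; the shapes are hypothetical and numpy's .shape stands in for torch's .size():

```python
import numpy as np

# A 2-D (batch, seq_len) tensor unpacks into two values as HF expects.
input_ids_2d = np.zeros((2, 5), dtype=np.int64)
bsz, seq_len = input_ids_2d.shape

# A 1-D packed token tensor has only one dimension to unpack.
input_ids_1d = np.zeros(10, dtype=np.int64)
try:
    bsz, seq_len = input_ids_1d.shape
except ValueError as err:
    print(err)  # not enough values to unpack (expected 2, got 1)
```

If that is the cause, the embedding lookup would need to be driven by the position ids (or a precomputed table indexed by them) rather than by the shape of input_ids.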

additional notes

I think I've described it clearly, but if you want to reproduce it completely, I can upload the code I'm currently using to GitHub.

hello-11 commented 1 week ago

@dingjingzhen Do you still have the problem? If not, we will close it soon.