Closed ehuaa closed 7 months ago
All the zips uploaded above can be tested directly under the tests/model folder.
Here is a document about how to debug. Hope it is helpful.
Thanks for your reply. I have checked this debug tutorial and found that the result turns wrong after the BertEmbedding layer. I wonder whether the embedding layer in BERT does not support special position ids. (It's hard to debug inside the Embedding layer because it's implemented in C++.) @byshiue
I am not sure what you mean by "special position ids". But you can print the result of the embedding and check whether the embedding works well or not.
I marked the output after the position embedding is added here, and it is not the same as what the Hugging Face Transformers BERT model produces. By "special position ids" I mean that I pass user-defined position_ids to the function above, which are calculated as , not the same as the normal pre-defined position_ids in https://github.com/NVIDIA/TensorRT-LLM/blob/6837c8141acd036b8884330f4eadb50e097163f7/tensorrt_llm/models/bert/model.py#L48C11-L48C11
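For reference, XLM-RoBERTa derives position ids from the padding mask rather than using a fixed 0..seq_len-1 range: padding positions keep padding_idx, and real tokens are numbered cumulatively from padding_idx + 1. A minimal plain-Python sketch of that logic (mirroring what transformers' `create_position_ids_from_input_ids` computes; the function name and example token ids here are just for illustration):

```python
def xlmr_position_ids(input_ids, padding_idx):
    """Compute XLM-RoBERTa-style position ids from token ids.

    Padding positions get padding_idx; non-padding tokens get
    padding_idx + 1, padding_idx + 2, ... in order of appearance.
    """
    position_ids = []
    for row in input_ids:
        running = 0
        row_ids = []
        for tok in row:
            if tok != padding_idx:
                running += 1
                row_ids.append(padding_idx + running)
            else:
                row_ids.append(padding_idx)
        position_ids.append(row_ids)
    return position_ids

# Example: pad token id 1 (the XLM-RoBERTa default), two pads at the end.
print(xlmr_position_ids([[0, 100, 200, 2, 1, 1]], padding_idx=1))
# -> [[2, 3, 4, 5, 1, 1]]
```

Note that a model built for BERT's 0-based contiguous position ids will look up different rows of the position embedding table when fed these ids, which is exactly where a mismatch against Hugging Face can appear.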
You can try marking the results of self.vocab_embedding and self.position_embedding as outputs to check their correctness.
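Once those intermediate tensors are dumped, you still need a way to quantify how far they are from the Hugging Face reference. A small hedged helper (plain Python on nested lists; the function name and tolerance are my own choices, not from the unittest) that reports the max absolute difference and the mismatch ratio:

```python
def mismatch_report(ref, out, atol=1e-2):
    """Compare two equally-shaped nested lists of floats.

    Returns (max absolute difference, fraction of elements whose
    difference exceeds atol).
    """
    flat_ref = [x for row in ref for x in row]
    flat_out = [x for row in out for x in row]
    diffs = [abs(a - b) for a, b in zip(flat_ref, flat_out)]
    max_diff = max(diffs)
    mismatch = sum(d > atol for d in diffs) / len(diffs)
    return max_diff, mismatch

# Example: one element of four differs by 0.5 -> 25% mismatch.
print(mismatch_report([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.5]]))
# -> (0.5, 0.25)
```

Checking the embedding output this way first narrows the bug down to either the embedding lookup or the later attention layers.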
@ehuaa Have you figured this out? I met the same problem.
Closing this bug because it is inactive. Please reopen it if needed.
GPU: V100, CUDA version: 12.2
Thanks for your great work. Now I want to deploy XLMRoberta with TensorRT-LLM, which differs from BERT only in how the position_ids are computed in bert_embeddings. So, following the issue I mentioned here, https://github.com/NVIDIA/TensorRT-LLM/issues/363, @byshiue suggested that I pass position_ids as an input array to the BERT forward function.
So I simply modified the original unittest file test_bert.py and passed position_ids as an input array to check whether it works. I ran the 3 tests below. 1) The original unittest for test_bert.py: it works well.
2) Pass real data to the original unittest. In this test, I just use real data to replace the generated fake data, and modify the hf_bert.forward function to use attention_masks for the Hugging Face Transformers model. The core modification is here:
```python
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
sentence_pairs = [['what is panda?', 'hi'],
                  ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
device_hf = torch.device("cuda")
inputs_hf = tokenizer(sentence_pairs, padding=True, truncation=True, return_tensors='pt', max_length=512).to(device_hf)
```

and the result is wrong.
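One detail worth checking when switching from fake to real data: if the TensorRT-LLM side expects per-sequence input lengths rather than an attention mask, they can be derived from the tokenizer's 0/1 mask. A minimal sketch (plain lists instead of tensors; the helper name is my own):

```python
def lengths_from_attention_mask(attention_mask):
    """Per-sequence token counts from a padded 0/1 attention mask."""
    return [sum(row) for row in attention_mask]

# Example: first sequence has 3 real tokens and 2 pads.
print(lengths_from_attention_mask([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]))
# -> [3, 5]
```

A length/mask mismatch between the two models makes the padded positions diverge even when the embeddings themselves are correct.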
And the whole test file is here (just a test_bert.py; I cannot upload a single .py file): test_bert_with_real_data.zip
3) Pass user-specified position_ids as an input. All 8 tests failed, with almost 100% mismatch.
The core modification is here:

```python
from transformers.models.xlm_roberta.modeling_xlm_roberta import create_position_ids_from_input_ids
```

and the whole test file is here (just a test_bert.py; I cannot upload a single .py file): test_bert_just_pass_position.zip
Can you help me take a look at my problem? Looking forward to your reply. Thanks!