ggerganov / llama.cpp

LLM inference in C/C++

how to add an extra fixed tensor to the token embedding in gpt2 arch #9198

Open Francis235 opened 2 weeks ago

Francis235 commented 2 weeks ago

Discussed in https://github.com/ggerganov/llama.cpp/discussions/9197

Originally posted by **Francis235** August 27, 2024

Hi, I want to know how to add an extra fixed tensor to the token embedding. My model is based on gpt2, and my input format is `[mel][bos][token][eos]`. I need to add my mel_embed (a fixed vector, such as 1x1600) to the embedding of `[bos][token][eos]`. The Python code is:

```python
hidden_states = inputs_embeds + position_embeds
hidden_states = torch.cat([mel_embeds, hidden_states], dim=1)
```

I tried to create the tensor in `build_gpt2` as follows:

```cpp
std::string inp_mel_path = "mel-token.txt";
std::vector<float> inp_mel_data;
read_txt_to_vec(inp_mel_path, inp_mel_data);
std::cout << "inp_mel_data size: "   << inp_mel_data.size() << std::endl;
std::cout << "inp_mel_data[0]: "     << inp_mel_data[0]     << std::endl;
std::cout << "inp_mel_data[1599]: "  << inp_mel_data[1599]  << std::endl;

// create tensor from vector
int64_t emb_size = 1600;
struct ggml_tensor * inp_mel = ggml_new_tensor_1d(ctx0, GGML_TYPE_F32, emb_size);
inp_mel->data = new float[emb_size];
for (int i0 = 0; i0 < emb_size; i0++) {
    ((float *)inp_mel->data)[i0] = inp_mel_data[i0];
}
inp_mel = ggml_cont(ctx0, ggml_reshape_4d(ctx0, inp_mel, emb_size, 1, 1, 1));
inpL = ggml_concat(ctx0, inp_mel, inpL, 1);
```

but I get the following error:

```
llama-cli: /workspace/llama.cpp/ggml/src/ggml-backend.c:1574: ggml_backend_sched_split_graph: Assertion `src_backend_id != -1' failed.
```

![img_v3_02e5_52055676-00d5-49c0-bdfe-c1902e3e664g](https://github.com/user-attachments/assets/1bc80e60-37a5-4fa3-b754-2fba6c4d88c2)

I think I should create the `inp_mel` tensor in advance, just like the model weights and biases created in `llm_load_tensors()`, but I don't know how to do that. Any suggestions? Thanks in advance.
slaren commented 2 weeks ago

This is not the correct way to allocate data for a tensor. Look at how the other input tensors are created (e.g. `build_inp_pos`), and add it to `llama_set_inputs` to set the data.
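For reference, here is a minimal sketch of that pattern. The names `inp_mel`, `build_inp_mel`, and `inp_mel_data` are hypothetical additions for this use case; the surrounding structure (`llm_build_context`, `llama_set_inputs`, `ggml_set_input`, `ggml_backend_tensor_set`) follows how `build_inp_pos` is wired up in llama.cpp at the time of this issue, so the exact locations may differ in newer revisions:

```cpp
// Sketch only: inp_mel / build_inp_mel / inp_mel_data are hypothetical names
// for this use case; the structure mirrors how build_inp_pos is wired up.

// 1) Add a slot in struct llama_context:
//        struct ggml_tensor * inp_mel = nullptr;

// 2) In llm_build_context, create the tensor and mark it as a graph input,
//    so the backend scheduler allocates its buffer (instead of pointing
//    tensor->data at manually new'd host memory):
struct ggml_tensor * build_inp_mel() {
    const int64_t n_mel = 1600; // size of the fixed mel embedding
    lctx.inp_mel = ggml_new_tensor_1d(ctx0, GGML_TYPE_F32, n_mel);
    cb(lctx.inp_mel, "inp_mel", -1);
    ggml_set_input(lctx.inp_mel);
    return lctx.inp_mel;
}

// 3) In build_gpt2, concatenate it in front of the token embeddings:
//        struct ggml_tensor * inp_mel = build_inp_mel();
//        inpL = ggml_concat(ctx0, inp_mel, inpL, 1);

// 4) In llama_set_inputs, copy the host-side data (e.g. the vector read
//    from mel-token.txt) into the tensor's backend buffer:
if (lctx.inp_mel) {
    ggml_backend_tensor_set(lctx.inp_mel, inp_mel_data.data(), 0,
                            ggml_nbytes(lctx.inp_mel));
}
```

The key difference from the snippet in the question is that `ggml_set_input` plus `ggml_backend_tensor_set` let the scheduler own the allocation, which is why the `src_backend_id != -1` assertion no longer fires: every source tensor in the graph is then backed by a buffer the scheduler knows about.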

Francis235 commented 2 weeks ago

> This is not the correct way to allocate data for a tensor. Look at how the other input tensors are created (e.g. `build_inp_pos`), and add it to `llama_set_inputs` to set the data.

Thanks, it works.