Open busishengui opened 1 year ago
Hi @busishengui can you explain a little bit about how you are setting the shared memory region in your client and how you are sending this information to the server?
```
// Create and map one system shared memory region that holds all three inputs.
std::string encoder_input_shm_key = "/encoder_input" + std::to_string(threadidx);
int encoder_shm_fd_ip = threadidx * 100 + 2;
void* encoder_input_shm;
size_t encoder_input_byte_size = feats.size() * sizeof(float) + 2 * sizeof(int64_t);
tc::CreateSharedMemoryRegion(encoder_input_shm_key, encoder_input_byte_size, &encoder_shm_fd_ip);
tc::MapSharedMemory(encoder_shm_fd_ip, 0, encoder_input_byte_size, (void**)&encoder_input_shm);
tc::CloseSharedMemory(encoder_shm_fd_ip);

// Pack feats, chunklens, and required_cache_size back to back.
// Offsets are applied through a char* because arithmetic on void* is not standard C++.
char* encoder_input_base = static_cast<char*>(encoder_input_shm);
memcpy(encoder_input_base, feats.data(), feats.size() * sizeof(float));
memcpy(encoder_input_base + feats.size() * sizeof(float), &chunklens, sizeof(int64_t));
memcpy(encoder_input_base + feats.size() * sizeof(float) + sizeof(int64_t), &required_cache_size, sizeof(int64_t));
// LOG(INFO) << "memcpy cost time is " << asa.Elapsed();

// Register the region with the server and point each input at its slice.
std::string shm_input_name = "encoder_input" + std::to_string(threadidx);
client->RegisterSystemSharedMemory(shm_input_name, encoder_input_shm_key, encoder_input_byte_size);
encoder_input_ptr->SetSharedMemory(shm_input_name, feats.size() * sizeof(float), 0);
chunk_lens_ptr->SetSharedMemory(shm_input_name, sizeof(int64_t), feats.size() * sizeof(float));
required_cache_size_ptr->SetSharedMemory(shm_input_name, sizeof(int64_t), feats.size() * sizeof(float) + sizeof(int64_t));

std::vector<tc::InferInput*> encoder_input_list = {encoder_input_ptr.get(),
                                                   chunk_lens_ptr.get(),
                                                   required_cache_size_ptr.get()};
tc::InferRequestedOutput* encoder_output;
tc::InferRequestedOutput::Create(&encoder_output, "output");
std::shared_ptr
encoder_options.sequence_end_ = (state == DecodeState::kEndFeats);
encoder_options.request_id_ = std::to_string(threadidx);
client->Infer(
    &encoder_results, encoder_options, encoder_input_list, encoder_outputs,
    http_headers, tc::Parameters(), request_compression_algorithm,
    response_compression_algorithm);
```
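The offsets passed to `SetSharedMemory` above have to match the positions the data was copied to. Below is a minimal standalone sketch of that packed layout, assuming the three inputs sit back to back in one region as in the snippet. `PackedLayout`, `MakeLayout`, and `PackInputs` are hypothetical helper names used only for illustration. They are not part of the Triton client library, and the sketch uses a plain byte buffer in place of a mapped region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte offsets and sizes of three inputs packed back to back in one region.
// Hypothetical helper type, not part of the Triton client API.
struct PackedLayout {
  std::size_t feats_offset;
  std::size_t feats_bytes;
  std::size_t chunk_lens_offset;
  std::size_t cache_size_offset;
  std::size_t total_bytes;
};

// Computes the layout for `num_feats` floats followed by two int64 values.
PackedLayout MakeLayout(std::size_t num_feats) {
  PackedLayout layout{};
  layout.feats_offset = 0;
  layout.feats_bytes = num_feats * sizeof(float);
  layout.chunk_lens_offset = layout.feats_offset + layout.feats_bytes;
  layout.cache_size_offset = layout.chunk_lens_offset + sizeof(std::int64_t);
  layout.total_bytes = layout.cache_size_offset + sizeof(std::int64_t);
  return layout;
}

// Copies the three values into `region`, which must hold at least
// MakeLayout(feats.size()).total_bytes bytes.
void PackInputs(void* region, std::size_t region_bytes,
                const std::vector<float>& feats, std::int64_t chunk_lens,
                std::int64_t required_cache_size) {
  const PackedLayout layout = MakeLayout(feats.size());
  assert(region_bytes >= layout.total_bytes);
  // Offsets are applied through a byte pointer; arithmetic on void* is a
  // compiler extension rather than standard C++.
  auto* base = static_cast<unsigned char*>(region);
  if (!feats.empty()) {
    std::memcpy(base + layout.feats_offset, feats.data(), layout.feats_bytes);
  }
  // memcpy is used because the int64 slots are not necessarily 8-byte aligned.
  std::memcpy(base + layout.chunk_lens_offset, &chunk_lens, sizeof(chunk_lens));
  std::memcpy(base + layout.cache_size_offset, &required_cache_size,
              sizeof(required_cache_size));
}
```

With a layout like this, the same `feats_offset`, `chunk_lens_offset`, and `cache_size_offset` values could be reused for both the `memcpy` calls and the matching `SetSharedMemory` calls, so the two cannot drift apart.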
Description
When I use shared memory, I get an error.
Triton Information
What version of Triton are you using? 22.12
Are you using the Triton container or did you build it yourself? Docker

To Reproduce
Steps to reproduce the behavior.
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior
A clear and concise description of what you expected to happen.