OpenVINO Version
2024.3.0 https://github.com/rahulchaphalkar/openvino/tree/add-extension
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
CPU
Framework
None
Model used
Detokenizer.xml from TinyLlama-1.1B-Chat-v1.0
Issue description
The STRING element_type has been added to the C API, but in my testing with models that take string tensors as input and produce them as output, I see incorrect results. I have a test case below comparing a working C++ case and a failing C case. I have done some processing on the received string data, as you can see in the test case, but I'm not able to get a valid string output.
Reference - https://docs.openvino.ai/2024/openvino-workflow/running-inference/string-tensors.html
Step-by-step reproduction
Reproduction of getting string data from the output of a model -
I was working with TinyLlama-1.1B-Chat-v1.0, which I got by following the recommended steps in the optimum-cli/gen.ai repos. I'm loading an extension in both cases. I have added support for loading extensions in the C API in my open PR, so you will need to use that for the C case below.
I am providing the detokenizer model with tokens extracted previously from the TinyLlama model.
C++ case prints this correct output
C case prints some invalid UTF-8.
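To narrow down what the C case actually receives, a small diagnostic along these lines might help. It is a minimal sketch in plain C that does not call OpenVINO at all: `is_valid_utf8`, `hex_dump`, and `report_bytes` are hypothetical helper names, and obtaining the pointer and byte length from the output tensor is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8, else 0. */
int is_valid_utf8(const unsigned char *buf, size_t len) {
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned long cp;  /* decoded code point */
        if (c < 0x80) { i++; continue; }                      /* ASCII */
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; }
        else return 0;     /* stray continuation byte or invalid lead byte */
        if (len - i <= n) return 0;                           /* truncated */
        for (size_t k = 1; k <= n; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) return 0;        /* not 10xxxxxx */
            cp = (cp << 6) | (unsigned long)(buf[i + k] & 0x3F);
        }
        /* Reject overlong encodings, UTF-16 surrogates, and values past U+10FFFF. */
        if ((n == 1 && cp < 0x80) || (n == 2 && cp < 0x800) ||
            (n == 3 && cp < 0x10000)) return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
        if (cp > 0x10FFFF) return 0;
        i += n + 1;
    }
    return 1;
}

/* Hypothetical helper: prints the raw bytes as hex, 16 per line. */
void hex_dump(FILE *out, const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        int end_of_line = ((i + 1) % 16 == 0) || (i + 1 == len);
        fprintf(out, "%02x%s", (unsigned)buf[i], end_of_line ? "\n" : " ");
    }
}

/* Hypothetical helper: reports validity and dumps the bytes for inspection. */
void report_bytes(FILE *out, const char *label, const void *data, size_t len) {
    assert(data != NULL || len == 0);
    const unsigned char *bytes = (const unsigned char *)data;
    fprintf(out, "%s: %lu bytes, %s UTF-8\n", label, (unsigned long)len,
            is_valid_utf8(bytes, len) ? "valid" : "INVALID");
    hex_dump(out, bytes, len);
}
```

Calling `report_bytes(stderr, "detokenizer output", ptr, len)` on the buffer read in the C case would show whether the bytes are text at all or look like something else, such as pointers or object internals, which may help when comparing against the C++ output.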
C++/Working Case
C/C-API Failing Case
Relevant log output
No response
Issue submission checklist