@WA225, We don't have an equivalent C API yet. It is straightforward to add one. What is your use scenario?
@yufenglee thank you for your fast reply. I am using the logits to compare the output of 2 models and having that API would be very helpful.
Thanks! You can use the Python API to unblock yourself for now. We will add the C/C++ API soon.
@yufenglee Okay, thank you so much! I have one other question: is there a way to modify the generator's input? I am trying to append some tokens to the generator's output before predicting the next token again. Is there a way to do that without having to create a new generator from scratch?
Do you want something like interactive decoding and a system prompt? We are working on it; it is not available yet.
Fixed by #755
@WA225 can you try now? We have merged the PR.
Hi @ajindal1. Thank you for the quick fix. I tried adding the following lines to the examples/c/src/main.cpp file after running the setup:
OgaTensor* output_logits;
CheckResult(OgaGenerator_GetOutput(generator, "logits", &output_logits));
When I try to compile with "cmake --build . --config Release", it fails with the following error:
error C3861: 'OgaGenerator_GetOutput': identifier not found [C:\Users\onnxruntime-genai\examples\c\build\phi3.vcxproj]
Hi @WA225, thanks for checking. Can you please try it the same way we test it in this line?
Basically, you need to call:
generator->GetOutput("logits");
Let me know if you still face any issue.
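For reference, a minimal sketch of what that call looks like inside a generation loop. This is not taken from the repo: it assumes the C++ wrapper in ort_genai.h (the style used in the tests), with model/params creation omitted and GetOutput assumed to return an owning pointer to an OgaTensor.

```cpp
// Sketch only: generation loop using the C++ wrapper from ort_genai.h,
// matching the generator->GetOutput("logits") call mentioned above.
// Model/params setup is omitted; GetOutput is assumed to return an
// owning pointer to an OgaTensor.
while (!generator->IsDone()) {
  generator->ComputeLogits();
  generator->GenerateNextToken();
  auto logits = generator->GetOutput("logits");  // per-step logits tensor
  // ... inspect or copy the logits here ...
}
```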
@ajindal1 Sure, but is there a way for me to integrate it into the example?
Hi @WA225, can you please share the steps to reproduce the issue on my end?
@ajindal1 Yes, I described the steps I followed in this previous comment. Please let me know if you need additional information.
@WA225 Below are the steps I tried, and they worked for me. If you are doing something else, please share all of your steps, including your setup info.
# Setup
# Docker image: nvcr.io/nvidia/pytorch:24.06-py3
# Ubuntu 22.04 with 8-V100
# Steps
git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
# Download ORT
curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.18.0/onnxruntime-win-x64-1.18.0.zip -o onnxruntime-win-x64-1.18.0.zip
tar xvf onnxruntime-win-x64-1.18.0.zip
move onnxruntime-win-x64-1.18.0 ort
# Build GenAI
python build.py
# Modify main.cpp - Added the two lines you mentioned [here](https://github.com/microsoft/onnxruntime-genai/blob/82fdb5ee515cc763b07e4dc1f7f8d2874c506ab6/examples/c/src/main.cpp#L93)
cd build/Linux/RelWithDebInfo
cmake --build . --config Release
Here is the final output:
root@6713693867cb:/workspace/onnxruntime-genai/build/Linux/RelWithDebInfo# cmake --build . --config Release
[ 27%] Built target opencv_core
[ 45%] Built target opencv_imgproc
[ 59%] Built target libjpeg
[ 64%] Built target libpng
[ 70%] Built target opencv_imgcodecs
[ 73%] Built target noexcep_operators
[ 75%] Built target ocos_operators
[ 78%] Built target ortcustomops
[ 86%] Built target onnxruntime-genai
[ 94%] Built target onnxruntime-genai-static
[ 94%] Built target gtest
[ 95%] Built target gtest_main
[ 97%] Built target unit_tests
[ 98%] Built target python
[ 98%] Building wheel on /workspace/onnxruntime-genai/build/Linux/RelWithDebInfo/wheel
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing /workspace/onnxruntime-genai/build/Linux/RelWithDebInfo/wheel
Preparing metadata (setup.py) ... done
Building wheels for collected packages: onnxruntime-genai
Building wheel for onnxruntime-genai (setup.py) ... done
Created wheel for onnxruntime-genai: filename=onnxruntime_genai-0.5.0.dev0-cp310-cp310-linux_x86_64.whl size=18875958 sha256=8ebe1e2758cc66872bb028462f4a507c6cf7266670803fb7a7a808a9859e0151
Stored in directory: /tmp/pip-ephem-wheel-cache-m5r7slmj/wheels/6c/78/fb/ac7ca54985b75e1ca53a772c61923707dc66c4e47b05d77616
Successfully built onnxruntime-genai
[notice] A new release of pip is available: 24.0 -> 24.2
[notice] To update, run: python3 -m pip install --upgrade pip
[ 98%] Built target PyPackageBuild
[100%] Built target model_benchmark
Here is the git diff:
while (!OgaGenerator_IsDone(generator)) {
CheckResult(OgaGenerator_ComputeLogits(generator));
CheckResult(OgaGenerator_GenerateNextToken(generator));
+ OgaTensor* output_logits;
+ CheckResult(OgaGenerator_GetOutput(generator, "logits", &output_logits));
const int32_t num_tokens = OgaGenerator_GetSequenceCount(generator, 0);
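One caveat about the snippet above, offered as a hedged sketch rather than a confirmed requirement: OgaGenerator_GetOutput hands back an OgaTensor*, so the caller presumably needs to release it once it is done with it. OgaDestroyTensor is my assumption for the matching cleanup call in ort_genai_c.h; check the header for the exact name.

```cpp
// Sketch: the same two lines as in the diff, plus cleanup so a tensor is
// not leaked on every loop iteration. OgaDestroyTensor is assumed to be
// the matching release call in ort_genai_c.h.
OgaTensor* output_logits = nullptr;
CheckResult(OgaGenerator_GetOutput(generator, "logits", &output_logits));
// ... read the logits here ...
OgaDestroyTensor(output_logits);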
@WA225 did you make this work?
Hello @ajindal1 and @natke, I tried running these commands on Windows, but it seems like the last cmake command fails without an error message. You can find the last part of the command's log below; I assume it failed because I do not get a "Built successfully" message. Assuming that it succeeds, where can I find the C example executable? Previously, I was compiling inside the example folder and getting the phi3.exe executable in examples/c/build/Release, as described on this page.
Log:
Building Custom Rule workspace/onnxruntime-genai/test/CMakeLists.txt
main.cpp
c_api_tests.cpp
model_tests.cpp
sampling_tests.cpp
sampling_benchmark.cpp
Generating Code...
Creating library workspace/onnxruntime-genai/build/Windows/RelWithDebInfo/test/Release/unit_tests.lib and object workspace/onnxruntime-genai/build/Windows/RelWithDebInfo/test/Release/unit_tests.exp
unit_tests.vcxproj -> workspace\onnxruntime-genai\build\Windows\RelWithDebInfo\test\Release\unit_tests.exe
Building Custom Rule workspace/onnxruntime-genai/CMakeLists.txt
Hi @WA225, I added a call to GetOutput in the C example (currently in PR #857). Please give it a try and let us know how it goes. The only thing you should have to build from source is the example.
Hi @natke. Thank you for your help. This works. How can I print the output_logits values?
I added commands to print out the logits to the PR: https://github.com/microsoft/onnxruntime-genai/pull/857
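Until the merged example is available, here is a rough sketch of how the values could be printed from the tensor inside the loop. The OgaTensorGetShapeRank / OgaTensorGetShape / OgaTensorGetData accessor names and the float32 element type are my assumptions, not taken from the PR; verify them against ort_genai_c.h.

```cpp
// Sketch only: dump the first few logit values from the tensor returned by
// OgaGenerator_GetOutput. Accessor names/signatures and the float32 element
// type are assumptions; check ort_genai_c.h for the exact API.
// Needs: #include <cstdio>, #include <vector>
size_t rank = 0;
CheckResult(OgaTensorGetShapeRank(output_logits, &rank));

std::vector<int64_t> shape(rank);
CheckResult(OgaTensorGetShape(output_logits, shape.data(), shape.size()));

void* raw = nullptr;
CheckResult(OgaTensorGetData(output_logits, &raw));
const float* logits = static_cast<const float*>(raw);

int64_t count = 1;
for (size_t i = 0; i < rank; ++i) count *= shape[i];

// Print the first 10 values (the shape is typically [batch, 1, vocab_size]).
for (int64_t i = 0; i < count && i < 10; ++i)
  std::printf("logits[%lld] = %f\n", static_cast<long long>(i), logits[i]);
```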
@WA225 #857 has been merged now. You can see the example here: https://github.com/microsoft/onnxruntime-genai/blob/main/examples/c/src/main.cpp#L45
Describe the bug
Is there an equivalent C API or method to the Python logits = generator.get_output("logits") call that allows us to get the logit values of the output? The documentation only shows a Python API for this purpose.